THE ANNALS 
of 
PROBABILITY 


AN OFFICIAL JOURNAL OF THE 
INSTITUTE OF MATHEMATICAL STATISTICS 


Articles 


Spectral measure of large random Hankel, Markov and Toeplitz matrices
    WLODZIMIERZ BRYC, AMIR DEMBO AND TIEFENG JIANG

On random almost periodic trigonometric polynomials and applications to ergodic theory
    GUY COHEN AND CHRISTOPHE CUNY

Maxima of asymptotically Gaussian random fields and moderate deviation approximations
to boundary crossing probabilities of sums of random variables with multidimensional indices
    HOCK PENG CHAN AND TZE LEUNG LAI

A Gaussian kinematic formula
    JONATHAN E. TAYLOR

Notes on the two-dimensional fractional Brownian motion
    FABRICE BAUDOIN AND DAVID NUALART

Random growth models with polygonal shapes
    JANKO GRAVNER AND DAVID GRIFFEATH

Late points for random walks in two dimensions
    AMIR DEMBO, YUVAL PERES, JAY ROSEN AND OFER ZEITOUNI

Gaussian estimates for spatially inhomogeneous random walks on Z^d
    SAMI MUSTAPHA

On the structure of solutions of ergodic type Bellman equation related to risk-sensitive control
    HIDEHIRO KAISE AND SHUENN-JYI SHEU

Large deviation for diffusions and Hamilton-Jacobi equation in Hilbert spaces
    JIN FENG

Finitely additive beliefs and universal type spaces
    MARTIN MEIER


Correction Notes


Improper regular conditional distributions
    TEDDY SEIDENFELD, MARK J. SCHERVISH AND JOSEPH B. KADANE

Central limit theorems for additive functionals of the simple exclusion process
    S. SETHURAMAN





Vol. 34, No. 1—January 2006 




INSTITUTE OF MATHEMATICAL STATISTICS 




The Annals of Probability (ISSN 0091-1798), Volume 34, Number 1, January 2006. Published bimonthly by
the Institute of Mathematical Statistics. Periodicals postage paid at Cleveland, Ohio, and at
additional mailing offices.







EDITORIAL STAFF
EDITOR
GREGORY F. LAWLER


ASSOCIATE EDITORS
ANDREW BARBOUR         OLLE HÄGGSTRÖM          DANIEL L. OCONE
ITAI BENJAMINI         FRANK DEN HOLLANDER     KAVITA RAMANAN
KRZYSZTOF BURDZY       TAKASHI KUMAGAI         GENNADY SAMORODNITSKY
MICHAEL CRANSTON       STANISLAW KWAPIEN       TIMO SEPPÄLÄINEN
PERSI DIACONIS         WENBO LI                ODED SCHRAMM
BRUCE DRIVER           RUSSELL LYONS           CRAIG TRACY
ALISON ETHERIDGE       JONATHAN MATTINGLY      OFER ZEITOUNI
                       CARL MUELLER
EDITORIAL ASSISTANT
GERI MATTSON
MANAGING EDITOR
MICHAEL PHELAN


PRODUCTION EDITOR
PATRICK KELLY


PAST EDITORS
THE ANNALS OF MATHEMATICAL STATISTICS


H. C. CARVER, 1930-1938          WILLIAM KRUSKAL, 1958-1961
S. S. WILKS, 1938-1949           J. L. HODGES, JR., 1961-1964
T. W. ANDERSON, 1950-1952        D. L. BURKHOLDER, 1964-1967
E. L. LEHMANN, 1953-1955         Z. W. BIRNBAUM, 1967-1970
T. E. HARRIS, 1955-1958          INGRAM OLKIN, 1970-1972


THE ANNALS OF PROBABILITY


RONALD PYKE, 1972-1975           BURGESS DAVIS, 1991-1993
PATRICK BILLINGSLEY, 1976-1978   JIM PITMAN, 1994-1996
R. M. DUDLEY, 1979-1981          S. R. S. VARADHAN, 1996-1999
HARRY KESTEN, 1982-1984          THOMAS G. KURTZ, 2000-2002
THOMAS M. LIGGETT, 1985-1987     STEVEN LALLEY, 2003-2005
PETER NEY, 1988-1990


EDITORIAL POLICY 


The main purpose of The Annals of Probability, The Annals of Applied Probability and
The Annals of Statistics is to publish contributions to the theory of probability and statistics
and to their applications. The emphasis is on importance and interest; formal novelty and
mathematical correctness alone are not sufficient. Also appropriate are authoritative expository 
papers and surveys of areas in vigorous development. All papers are refereed. 


NOTICE 


Manuscripts submitted for publication in The Annals of Probability should be submitted
electronically. Authors should access the Electronic Journal Management System (EJMS) at
http://www.e-publications.org/ims/submission/.


IMS ORGANIZATIONAL MEMBERS 


ACADEMIA SINICA 

ARIZONA STATE UNIVERSITY 
AUSTRALIAN NATIONAL UNIVERSITY 
BATH UNIVERSITY 


BATTELLE PACIFIC 
NORTHWEST NATIONAL LABORATORY 


BOWLING GREEN UNIVERSITY 

CARLETON UNIVERSITY 

CENTRUM VOOR WISKUNDE EN INFORMATICA 
CALIFORNIA STATE UNIVERSITY, EAST BAY 


CHALMERS UNIVERSITY OF TECHNOLOGY
& GÖTEBORG UNIVERSITY


CORNELL UNIVERSITY
DUKE UNIVERSITY 

EINDHOVEN UNIVERSITY OF TECHNOLOGY 
FIOCRUZ-FUNDAÇÃO OSWALDO CRUZ
FLORIDA STATE UNIVERSITY 

FU JEN CATHOLIC UNIVERSITY 

HARVARD UNIVERSITY 

HIROSHIMA UNIVERSITY

INDIAN INSTITUTE OF TECHNOLOGY 
INDIAN STATISTICAL INSTITUTE 

INDIANA UNIVERSITY 

INSTITUTE FOR DEFENSE ANALYSIS 

IOWA STATE UNIVERSITY 

ISTITUTO PER LE APPLICAZIONI DEL CALCOLO 
JOHNS HOPKINS UNIVERSITY 

KANSAS STATE UNIVERSITY 


LONDON SCHOOL OF ECONOMICS 
& POLITICAL SCIENCE 


LUND UNIVERSITY 


MASSACHUSETTS INSTITUTE 
OF TECHNOLOGY 


MATHEMATICAL SCIENCES 
RESEARCH INSTITUTE 


MCGILL UNIVERSITY 

MEDICAL COLLEGE OF WISCONSIN 

MEMORIAL SLOAN KETTERING CANCER CENTER 
MICHIGAN STATE UNIVERSITY
MINNESOTA STATE UNIVERSITY 

NANZAN UNIVERSITY 

NATIONAL SCIENCE FOUNDATION 
NATIONAL CENTRAL UNIVERSITY 
NATIONAL CHENG KUNG UNIVERSITY 
NATIONAL CHIAO TUNG UNIVERSITY 
NATIONAL SECURITY AGENCY
NEW MEXICO STATE UNIVERSITY
NORTH CAROLINA STATE UNIVERSITY 
NORTH DAKOTA STATE UNIVERSITY 


NORTHERN ILLINOIS UNIVERSITY 


NOTTINGHAM TRENT UNIVERSITY


OREGON STATE UNIVERSITY 
PENNSYLVANIA STATE UNIVERSITY 

PFIZER INC. 

PRINCETON UNIVERSITY 

PURDUE UNIVERSITY 

QUEENS UNIVERSITY 

RICE UNIVERSITY 

ROCKEFELLER UNIVERSITY 

RUTGERS UNIVERSITY 

SIEGEN UNIVERSITY 

SOUTHERN ILLINOIS UNIVERSITY 
STOCKHOLM UNIVERSITY 

TECHNISCHE UNIVERSITAT 

TEXAS A&M UNIVERSITY 

TEXAS TECH UNIVERSITY 

UNITED STATES, DEPARTMENT OF DEFENSE 
UNIVERSIDAD AUTÓNOMA DE MADRID 
UNIVERSIDADE DE COIMBRA 

UNIVERSITÀ COMMERCIALE LUIGI BOCCONI
UNIVERSITÀ DEGLI STUDI DI PADOVA
UNIVERSITÀ DEGLI STUDI DI ROMA LA SAPIENZA
UNIVERSITÄT BERN

UNIVERSITÄT KARLSRUHE

UNIVERSITÄT MÜNSTER

UNIVERSITÄT ZU LÜBECK

UNIVERSITY OF ALBERTA 

UNIVERSITY OF ARIZONA 

UNIVERSITY OF BRITISH COLUMBIA 
UNIVERSITY OF CALGARY 

UNIVERSITY OF CALIFORNIA, IRVINE 
UNIVERSITY OF CALIFORNIA, LOS ANGELES 
UNIVERSITY OF CALIFORNIA, SAN DIEGO 
UNIVERSITY OF CALIFORNIA, SANTA CRUZ 
UNIVERSITY OF CONNECTICUT 
UNIVERSITY OF DENVER 
UNIVERSITY OF EDINBURGH 
UNIVERSITY OF FLORIDA 
UNIVERSITY OF GEORGIA 
UNIVERSITY OF ILLINOIS 
UNIVERSITY OF IOWA 
UNIVERSITY OF MASSACHUSETTS 
UNIVERSITY OF MICHIGAN 


UNIVERSITY OF MINNESOTA
UNIVERSITY OF MISSISSIPPI
UNIVERSITY OF MISSOURI 
UNIVERSITY OF MONTREAL 
UNIVERSITY OF NEW BRUNSWICK 


UNIVERSITY OF NEW MEXICO 
UNIVERSITY OF NORTH CAROLINA 
UNIVERSITY OF OREGON 
UNIVERSITY OF OTTAWA 
UNIVERSITY OF OXFORD 
UNIVERSITY OF PENNSYLVANIA 
UNIVERSITY OF PITTSBURGH 
UNIVERSITY OF SOUTH CAROLINA 




UNIVERSITY OF TEXAS, DALLAS 
UNIVERSITY OF TEXAS, HOUSTON 
UNIVERSITY OF VICTORIA 

UNIVERSITY OF WASHINGTON 
UNIVERSITY OF WATERLOO 

VIRGINIA COMMONWEALTH UNIVERSITY 
WAYNE STATE UNIVERSITY 

YORK UNIVERSITY 


THE ANNALS OF PROBABILITY 


INSTRUCTIONS FOR AUTHORS 


Submission of Papers. Papers must be submitted 
electronically. Authors should access the Electronic 
Journal Management System (EJMS) at http://www.e- 
publications.org/ims/submission/. If you are a first time 
user you must complete the registration. You are only re- 
quired to register once. 

After the registration is complete you will have the 
option to submit your manuscript. After completing the 
form you will then upload your PDF file. 




Preparation of Manuscripts. Authors using LaTeX 
should begin their document with 

\documentclass[11pt,leqno]{article}
and produce a double-spaced manuscript with the com- 
mand 

\usepackage{setspace}\doublespacing. 

For further information on preparing your manuscript, 
please see http://www.imstat.org/aop/manprep.htm where 
you will find LaTeX support page for IMS publications 
to use the IMS recommended template. 


Submission of Reference Papers. Four copies of 
unpublished or not easily available papers cited in 
the manuscript should be submitted with the manuscript. 


Title. The title should be descriptive and as concise 
as is feasible, that is, it should indicate the topic of the 
paper as clearly as possible, but every word in it should 
be pertinent. 


Abbreviated Title. An abbreviated title to be used 
as a running head is also required. This should 
normally not exceed 35 characters. For example, an 
article with the title “The Curvature of a Statisti- 
cal Model, with Applications to Large-Sample Likeli- 
hood Methods,” could have the running head “Curvature 
of Statistical Model” or possibly “Asymptotics of Like- 
lihood Methods” depending on the emphasis to be con- 
veyed. 


Affiliation. Indicate your present institutional affiliation 
as you would like it to appear. 


Summary. Each manuscript is required to contain a 
summary, clearly separated from the rest of the 
paper, which will be printed immediately after the 
title. Its main purpose is to inform the reader quickly 
of the nature and results of the paper; it may also be 
used as an aid in retrieving information. The length 
of a summary will clearly depend on the length and 
difficulty of the paper, but in general it should not 
exceed 150 words. Formulas should be used as spar- 
ingly as possible within the summary. The summary 
should not make reference to results or formulas in
the body of the paper—it should be self-contained. 


Footnotes. Footnotes should not be used, except as 
described under Title Page Footnotes below. Such 
information should be included within the text. 


Title Page Footnotes. Included as a footnote on page 1 
should be the headings: AMS 2000 subject classifica- 
tions. Primary-; secondary-. Key words and phrases. 

The classification numbers representing the primary
and secondary subjects of the article may be found at
www.ams.org/msc/. The key words and phrases should
describe the subject matter of the article; generally they
should be taken from the body of the paper.

Acknowledgment of support, grants and contracts
should also be included in this footnote.


Figures. Figures are best prepared as separate postscript
or encapsulated postscript files and should be included 
with the manuscript. 


References. Citations in text should be numbered 
“...using examples shown in [1]...” and the bibliography
should be styled to appear as follows:


[1] LAMPORT, L. (1994). LaTeX: A Document Preparation
System, 2nd ed. Addison-Wesley, Reading, MA. 


[2] CHEN, X. (1999). How often does a Harris recurrent 
Markov chain recur? Ann. Probab. 27 1324-1346. 


Abbreviations for journals should be taken from
a current issue of Mathematical Reviews or from 
http://www.ams.org/msnhtml/serials.pdf. 


Copyright, Page Charges and Offprints. Page charges 
are $45 per printed page. Payment of some or all 
of the estimated page charges associated with articles is 
strongly encouraged. The editorial review of articles and 
administration of page charges are completely separate 
activities. Manuscripts are reviewed and accepted prior 
to determining whether page charges will be paid. 

Every corresponding author will receive a pdf file via 
e-mail of the article. You do not need to do anything to receive
this file; it will happen automatically. Offprints may
be purchased by using the IMS Offprint Purchase Order 
Form accompanying the galleys. 

Copyright Transfer and Page Charges forms are required;
Offprint forms are optional. We must have the required
forms before your article can be published. Note
the address/fax number for each form on the form itself. 


Galley Proofs. Authors will receive e-mail notification 
when galleys are ready and have the option of either 
downloading a pdf version of the article or having it sent 
by regular mail. Similarly, authors may return corrections 
either by e-mail or regular mail. 


Correspondence. All correspondence with the editor 
must refer to the manuscript number of the paper. This 
number will be sent to the author acknowledging receipt 
of the article. 


The Annals of Probability 

2006, Vol. 34, No. 1, 1-38 

DOI: 10.1214/009117905000000495

© Institute of Mathematical Statistics, 2006 


SPECTRAL MEASURE OF LARGE RANDOM HANKEL, MARKOV 
AND TOEPLITZ MATRICES¹


BY WLODZIMIERZ BRYC, AMIR DEMBO 
AND TIEFENG JIANG 


University of Cincinnati, Stanford University 
and University of Minnesota 


We study the limiting spectral measure of large symmetric random matrices
of linear algebraic structure.

For Hankel and Toeplitz matrices generated by i.i.d. random variables
$\{X_k\}$ of unit variance, and for symmetric Markov matrices generated by i.i.d.
random variables $\{X_{ij}\}_{j>i}$ of zero mean and unit variance, scaling the
eigenvalues by $\sqrt{n}$ we prove the almost sure, weak convergence of the spectral
measures to universal, nonrandom, symmetric distributions $\gamma_H$, $\gamma_M$ and $\gamma_T$
of unbounded support. The moments of $\gamma_H$ and $\gamma_T$ are the sum of volumes
of solids related to Eulerian numbers, whereas $\gamma_M$ has a bounded smooth
density given by the free convolution of the semicircle and normal densities.

For symmetric Markov matrices generated by i.i.d. random variables
$\{X_{ij}\}_{j>i}$ of mean $m$ and finite variance, scaling the eigenvalues by $n$ we
prove the almost sure, weak convergence of the spectral measures to the
atomic measure at $-m$. If $m = 0$, and the fourth moment is finite, we prove
that the spectral norm of $\mathbf{M}_n$ scaled by $\sqrt{2n\log n}$ converges almost surely
to 1.


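1. Introduction and main results. For a symmetric $n \times n$ matrix $\mathbf{A}$, let
$\lambda_j(\mathbf{A})$, $1 \le j \le n$, denote the eigenvalues of the matrix $\mathbf{A}$, written in a
nonincreasing order. The spectral measure of $\mathbf{A}$, denoted $\hat{\mu}(\mathbf{A})$, is the empirical
distribution of its eigenvalues, namely

$$\hat{\mu}(\mathbf{A}) = \frac{1}{n}\sum_{j=1}^{n} \delta_{\lambda_j(\mathbf{A})}$$

[so when $\mathbf{A}$ is a random matrix, $\hat{\mu}(\mathbf{A})$ is a random measure on $(\mathbb{R}, \mathcal{B})$].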
Large-dimensional random matrices are of much interest in statistics, where 

they play a pivotal role in multivariate analysis. In his seminal paper, Wigner [24] 

proved that the spectral measure of a wide class of symmetric random matrices 


Received May 2004; revised January 2005.


¹Supported in part by NSF Grants INT-0332062, DMS-00-72331, DMS-03-08151, DMS-04-49365
and DMS-05-04198.


AMS 2000 subject classifications. Primary 15A52; secondary 60F99, 62H10, 60F10.


Key words and phrases. Random matrix theory, spectral measure, free convolution, Eulerian 
numbers. 




of dimension $n$ converges, as $n \to \infty$, to the semicircle law (also called the
Sato-Tate measure, see [21] and the references therein). Much work has since been done
on related random matrix ensembles, either composed of (nearly) independent
entries, or drawn according to weighted Haar measures on classical (e.g., orthogonal,
unitary, symplectic) groups. The limiting behavior of the spectrum of such
matrices and their compositions is of considerable interest for mathematical physics
(see [17] and the references therein). In addition, such random matrices play an 
important role in operator algebra studies initiated by Voiculescu, known now as 
the free (noncommutative) probability theory (see [12] and the many references 
therein). The study of large random matrices is also related to interesting ques- 
tions of combinatorics, geometry and algebra (see [9], or, e.g., [22]). In his recent 
review paper [1], Bai proposes the study of large random matrix ensembles with 
certain additional linear structure. In particular, the properties of the spectral mea- 
sures of random Hankel, Markov and Toeplitz matrices with independent entries 
are listed among the unsolved random matrix problems posed in [1], Section 6. We 
shall provide here the solution for these three problems. 

We note in passing that Hankel matrices arise, for example, in polynomial
regression, as the covariance for the least squares parameter estimation for the model
$\sum_{i=0}^{n-1} b_i x^i$, observed at $x = x_1, \ldots, x_m$ in the presence of additive noise (see [20],
page 36). Toeplitz matrices appear as the covariance of stationary processes, in
shift-invariant linear filtering, and in many aspects of combinatorics, time series
and harmonic analysis. See [10] for classical results on deterministic Toeplitz
matrices, or [7] and the references therein, for their applications to certain random
matrices. The infinitesimal generators of continuous-time Markov processes on
finite state spaces are given by matrices with row-sums zero (which we call Markov
matrices). Such matrices also play an important role in graph theory, as the
Laplacian matrix of each graph is of this form, with its eigenvalues related to numerous
graph invariants; see [15].

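We next specify the corresponding ensembles of random matrices studied here.
Let $\{X_k : k = 0, 1, 2, \ldots\}$ be a sequence of i.i.d. real-valued random variables. For
$n \in \mathbb{N}$, define a random $n \times n$ Hankel matrix $\mathbf{H}_n = [X_{i+j-1}]_{1 \le i,j \le n}$,

$$
\mathbf{H}_n = \begin{bmatrix}
X_1 & X_2 & \cdots & X_{n-1} & X_n \\
X_2 & X_3 & \cdots & X_n & X_{n+1} \\
\vdots & \vdots & & \vdots & \vdots \\
X_{n-1} & X_n & \cdots & X_{2n-3} & X_{2n-2} \\
X_n & X_{n+1} & \cdots & X_{2n-2} & X_{2n-1}
\end{bmatrix},
\tag{1.1}
$$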




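and a random $n \times n$ Toeplitz matrix $\mathbf{T}_n = [X_{|i-j|}]_{1 \le i,j \le n}$,

$$
\mathbf{T}_n = \begin{bmatrix}
X_0 & X_1 & \cdots & X_{n-2} & X_{n-1} \\
X_1 & X_0 & \cdots & X_{n-3} & X_{n-2} \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
X_{n-2} & X_{n-3} & \cdots & X_0 & X_1 \\
X_{n-1} & X_{n-2} & \cdots & X_1 & X_0
\end{bmatrix}.
\tag{1.2}
$$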
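The two ensembles in (1.1) and (1.2) can be built directly from the index formulas $X_{i+j-1}$ and $X_{|i-j|}$. The following is a minimal sketch, not the authors' code: the function names `hankel_matrix` and `toeplitz_matrix`, the Gaussian choice of entries and the seed are assumptions made only for illustration.

```python
import numpy as np

def hankel_matrix(x, n):
    """n x n Hankel matrix H_n = [X_{i+j-1}], 1 <= i, j <= n.

    x[k] holds X_k, so the entries X_1, ..., X_{2n-1} are used.
    """
    i = np.arange(1, n + 1)
    return x[i[:, None] + i[None, :] - 1]

def toeplitz_matrix(x, n):
    """n x n symmetric Toeplitz matrix T_n = [X_{|i-j|}], 1 <= i, j <= n."""
    i = np.arange(1, n + 1)
    return x[np.abs(i[:, None] - i[None, :])]

rng = np.random.default_rng(0)      # illustrative seed
n = 5
x = rng.standard_normal(2 * n)      # X_0, ..., X_{2n-1}; Gaussian entries are an assumption
H = hankel_matrix(x, n)
T = toeplitz_matrix(x, n)
```

Both matrices are symmetric by construction: `H` is constant along anti-diagonals and `T` along diagonals.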


The limiting spectral distribution for a Toeplitz matrix T, is as follows. 


THEOREM 1.1. Let $\{X_k : k = 0, 1, 2, \ldots\}$ be a sequence of i.i.d. real-valued
random variables with $\operatorname{Var}(X_1) = 1$. Then with probability 1, $\hat{\mu}(\mathbf{T}_n/\sqrt{n})$
converges weakly as $n \to \infty$ to a nonrandom symmetric probability measure $\gamma_T$ which
does not depend on the distribution of $X_1$, and has unbounded support.


The spectrum of nonrandom Toeplitz matrices, the rows of which are typically
absolutely summable, is well approximated by its counterpart for circulant
matrices (cf. [10], page 84). In contrast, note that the limiting distribution $\gamma_T$ is not
normal, as the calculation shows that the fourth moment is $m_4 = 8/3$. This differs
from the analogous results for random circulant matrices (see [4]), a fact that has
been independently noticed also in references [3, 11].
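The value $m_4 = 8/3$ can be checked by simulation. This is a rough Monte Carlo sketch under assumptions that are not part of the paper: Gaussian entries, matrix size 200, 200 repetitions and a fixed seed. It estimates $E[\frac{1}{n}\operatorname{tr}(\mathbf{T}_n/\sqrt{n})^4]$, which should be close to 8/3 rather than to 3, the fourth moment of the standard normal law.

```python
import numpy as np

def toeplitz_fourth_moment(n, reps, rng):
    """Monte Carlo estimate of E[(1/n) tr((T_n / sqrt(n))^4)]."""
    idx = np.abs(np.subtract.outer(np.arange(n), np.arange(n)))  # |i - j|
    total = 0.0
    for _ in range(reps):
        x = rng.standard_normal(n)       # X_0, ..., X_{n-1} with unit variance
        T = x[idx] / np.sqrt(n)          # scaled symmetric Toeplitz matrix
        T2 = T @ T
        total += np.sum(T2 * T2) / n     # tr(T^4) = ||T^2||_F^2 since T^2 is symmetric
    return total / reps

rng = np.random.default_rng(1)           # illustrative seed
m4 = toeplitz_fourth_moment(n=200, reps=200, rng=rng)
print(m4, 8 / 3)
```

The estimate carries both Monte Carlo noise and a finite-$n$ bias of order $1/n$, so only approximate agreement should be expected.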

Our next result gives the limiting spectral distribution for a Hankel matrix H,. 


THEOREM 1.2. Let $\{X_k : k = 0, 1, 2, \ldots\}$ be a sequence of i.i.d. real-valued
random variables with $\operatorname{Var}(X_1) = 1$. Then with probability 1, $\hat{\mu}(\mathbf{H}_n/\sqrt{n})$
converges weakly as $n \to \infty$ to a nonrandom symmetric probability measure $\gamma_H$
which does not depend on the distribution of $X_1$, has unbounded support and is
not unimodal.


(Recall that a symmetric distribution $\nu$ is said to be unimodal, if the function
$x \mapsto \nu((-\infty, x])$ is convex for $x < 0$.)


REMARK 1.1. Theorems 1.1 and 1.2 fall short of establishing that the
limiting distributions have smooth densities and that the density of $\gamma_H$ is bimodal.
Simulations suggest that these properties are likely to be true; see Figure 1.


REMARK 1.2. Consider the empirical distribution of singular values of the
nonsymmetric random $n \times n$ Toeplitz matrix $\mathbf{R}_n = [X_{i-j}]_{1 \le i,j \le n}$. It follows from
Theorem 1.2 that as $n \to \infty$, with probability 1, $\hat{\mu}((\mathbf{R}_n\mathbf{R}_n^T)^{1/2}/\sqrt{n}) \to \nu$ weakly,
where $\nu([0, x]) = \gamma_H([-x, x])$, $x > 0$. Indeed, let $\mathbf{J}_n = [\mathbf{1}_{i+j=n+1}]_{1 \le i,j \le n}$, noting
that $\mathbf{J}_n \times \mathbf{R}_n^T$ is the Hankel matrix $\mathbf{H}_n$ for $\{X_{k-n} : k = 0, 1, \ldots\}$ to which
Theorem 1.2 applies. Since $\mathbf{J}_n^2 = \mathbf{I}_n$, and both $\mathbf{J}_n$ and $\mathbf{J}_n \times \mathbf{R}_n^T$ are symmetric, we have
$\mathbf{R}_n\mathbf{R}_n^T = (\mathbf{R}_n\mathbf{J}_n)(\mathbf{J}_n\mathbf{R}_n^T) = \mathbf{H}_n^2$. Thus the singular values of matrix $\mathbf{R}_n$ are the
absolute values of the (real) eigenvalues of the symmetric Hankel matrix $\mathbf{H}_n$.


FIG. 1. Histograms of the empirical distribution of eigenvalues of 100 realizations of the Hankel
and Toeplitz matrices with standardized triangular $U - U'$ entries.
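The identity behind Remark 1.2 is easy to confirm numerically. In the sketch below, flipping the rows of $\mathbf{R}_n^T$ with the exchange matrix $\mathbf{J}_n$ gives a symmetric Hankel matrix whose squared version equals $\mathbf{R}_n\mathbf{R}_n^T$. The size, the seed and the Gaussian entries are assumptions chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)                 # illustrative seed
n = 6
# x[k + n - 1] holds X_k for k = -(n-1), ..., n-1
x = rng.standard_normal(2 * n - 1)
i = np.arange(n)
R = x[(i[:, None] - i[None, :]) + n - 1]       # R_n = [X_{i-j}], nonsymmetric Toeplitz
J = np.fliplr(np.eye(n))                       # J_n = [1_{i+j=n+1}], the exchange matrix
H = J @ R.T                                    # symmetric Hankel matrix of Remark 1.2

singular_values = np.sort(np.linalg.svd(R, compute_uv=False))
abs_eigenvalues = np.sort(np.abs(np.linalg.eigvalsh(H)))
```

Comparing `singular_values` with `abs_eigenvalues` shows the two sorted lists agree up to rounding error.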


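We now turn to the Markov matrices $\mathbf{M}_n$. Let $\{X_{ij} : j \ge i \ge 1\}$ be an infinite
upper triangular array of i.i.d. random variables and define $X_{ji} = X_{ij}$ for $j > i \ge 1$.
Let $\mathbf{M}_n$ be a random $n \times n$ symmetric matrix given by

$$\mathbf{M}_n = \mathbf{X}_n - \mathbf{D}_n, \tag{1.3}$$

where $\mathbf{X}_n = [X_{ij}]_{1 \le i,j \le n}$ and $\mathbf{D}_n = \operatorname{diag}\big(\sum_{j=1}^{n} X_{ij}\big)_{1 \le i \le n}$ is a diagonal matrix, so
each of the rows of $\mathbf{M}_n$ has a zero sum (note that the values of $X_{ii}$ are irrelevant
for $\mathbf{M}_n$), that is,

$$
\mathbf{M}_n = \begin{bmatrix}
-\sum_{j=2}^{n} X_{1j} & X_{12} & X_{13} & \cdots & \cdots & X_{1n} \\
X_{21} & -\sum_{j \ne 2} X_{2j} & X_{23} & \cdots & \cdots & X_{2n} \\
\vdots & & \ddots & & & \vdots \\
X_{k1} & X_{k2} & \cdots & -\sum_{j \ne k} X_{kj} & \cdots & X_{kn} \\
\vdots & & & & \ddots & \vdots \\
X_{n1} & X_{n2} & \cdots & \cdots & X_{n,n-1} & -\sum_{j=1}^{n-1} X_{nj}
\end{bmatrix}.
$$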
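Definition (1.3) translates into a few lines of code. The sketch below is illustrative only: the helper name `markov_matrix`, the Gaussian entries and the seed are assumptions, and the diagonal of $\mathbf{X}_n$ is set to zero because, as noted above, the values $X_{ii}$ cancel in $\mathbf{M}_n$.

```python
import numpy as np

def markov_matrix(n, rng):
    """Symmetric random matrix M_n = X_n - D_n with zero row sums."""
    upper = np.triu(rng.standard_normal((n, n)), k=1)  # i.i.d. entries above the diagonal
    X = upper + upper.T                                # symmetric, zero diagonal
    D = np.diag(X.sum(axis=1))                         # D_n = diagonal matrix of row sums of X_n
    return X - D

rng = np.random.default_rng(3)                         # illustrative seed
M = markov_matrix(8, rng)
row_sums = M.sum(axis=1)
```

Each entry of `row_sums` vanishes up to rounding, and the diagonal of `M` equals minus the sum of the off-diagonal entries in its row.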




Wigner's classical result says that $\hat{\mu}(\mathbf{X}_n/\sqrt{n})$ converges weakly as $n \to \infty$
to the (standard) semicircle law with the density $\sqrt{4 - x^2}/(2\pi)$ on $(-2, 2)$. For
normal $\mathbf{X}_n$ and normal i.i.d. diagonal $\mathbf{D}_n$ independent of $\mathbf{X}_n$, the weak limit of
$\hat{\mu}((\mathbf{X}_n - \mathbf{D}_n)/\sqrt{n})$ is the free convolution of the semicircle and standard normal
measures; see [17] and the references therein (see also [2] for the definition and
properties of the free convolution). This predicted result holds also for the Markov
matrix $\mathbf{M}_n$, but the problem is nontrivial because $\mathbf{D}_n$ strongly depends on $\mathbf{X}_n$.


THEOREM 1.3. Let $\{X_{ij} : j \ge i \ge 1\}$ be a collection of i.i.d. random
variables with $EX_{12} = 0$ and $\operatorname{Var}(X_{12}) = 1$. With probability 1, $\hat{\mu}(\mathbf{M}_n/\sqrt{n})$ converges
weakly as $n \to \infty$ to the free convolution $\gamma_M$ of the semicircle and standard
normal measures. This measure $\gamma_M$ is a nonrandom symmetric probability measure
with smooth bounded density, does not depend on the distribution of $X_{12}$ and has
unbounded support.


If the mean of $X_{ij}$ is not zero, the following result is relevant.


THEOREM 1.4. Let $\{X_{ij} : i, j \in \mathbb{N},\ j \ge i \ge 1\}$ be a collection of i.i.d. random
variables with $EX_{12} = m$ and $EX_{12}^2 < \infty$. Then $\hat{\mu}(\mathbf{M}_n/n)$ converges weakly to
$\delta_{-m}$ as $n \to \infty$.
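A quick simulation illustrates the concentration of the spectrum of $\mathbf{M}_n/n$ near $-m$ described in Theorem 1.4. This is a sketch under assumptions not taken from the paper: entries are $N(m, 1)$, the size is 300, $m = 1.5$, and the window of width 0.25 is an arbitrary choice. The helper name `markov_matrix_with_mean` is hypothetical.

```python
import numpy as np

def markov_matrix_with_mean(n, m, rng):
    """M_n = X_n - D_n built from i.i.d. N(m, 1) entries above the diagonal."""
    upper = np.triu(m + rng.standard_normal((n, n)), k=1)
    X = upper + upper.T                     # symmetric; diagonal left at zero
    return X - np.diag(X.sum(axis=1))       # subtract the diagonal matrix of row sums

rng = np.random.default_rng(4)              # illustrative seed
n, m = 300, 1.5
eigenvalues = np.linalg.eigvalsh(markov_matrix_with_mean(n, m, rng)) / n
fraction_near = np.mean(np.abs(eigenvalues + m) < 0.25)
print(np.median(eigenvalues), fraction_near)
```

One eigenvalue is always 0, because the all-ones vector lies in the kernel of $\mathbf{M}_n$; a single atom of mass $1/n$ does not affect the weak limit.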


Turning to the asymptotics of the spectral norm $\|\mathbf{M}_n\| := \max\{\lambda_1(\mathbf{M}_n),
-\lambda_n(\mathbf{M}_n)\}$ of the symmetric matrix $\mathbf{M}_n$, that is, the largest absolute value of its
eigenvalues, we have the following.


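THEOREM 1.5. Let $\{X_{ij} : i, j \in \mathbb{N},\ j > i \ge 1\}$ be a collection of i.i.d. random
variables with $EX_{12} = 0$, $\operatorname{Var}(X_{12}) = 1$ and $EX_{12}^4 < \infty$. Then

$$\lim_{n \to \infty} \frac{\|\mathbf{M}_n\|}{\sqrt{2n\log n}} = 1 \quad \text{a.s.}$$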


If the mean of $X_{ij}$ is not zero, the following result is relevant.


COROLLARY 1.6. Suppose $EX_{12} = m$ and $EX_{12}^4 < \infty$. Then

$$\lim_{n \to \infty} \frac{\|\mathbf{M}_n\|}{n} = |m| \quad \text{a.s.}$$


Theorem 1.5 reveals a scaling in $n$ that differs from that of the spectral norm of
Wigner's ensemble, where under the same conditions, almost surely,

$$\lim_{n \to \infty} \frac{\|\mathbf{X}_n\|}{\sqrt{n}} = 2 \tag{1.4}$$

(cf. [1], Theorem 2.12). As shown in Section 2 en route to proving Theorems 1.4,
1.5 and Corollary 1.6, this is due to the domination of the diagonal terms of $\mathbf{M}_n$ in
determining its spectral norm.
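The two scalings can be compared in a small experiment. The sketch below assumes Gaussian entries, size $n = 400$ and a fixed seed, none of which come from the paper. Because the normalization in Theorem 1.5 involves $\log n$, the Markov ratio approaches 1 slowly and is only roughly near 1 at this size, while the Wigner ratio is already close to 2.

```python
import numpy as np

rng = np.random.default_rng(6)                       # illustrative seed
n = 400
upper = np.triu(rng.standard_normal((n, n)), k=1)    # mean 0, variance 1 above the diagonal
X = upper + upper.T                                  # Wigner-type symmetric matrix
M = X - np.diag(X.sum(axis=1))                       # Markov matrix with zero row sums

norm_X = np.linalg.norm(X, 2)                        # spectral norm (largest singular value)
norm_M = np.linalg.norm(M, 2)
ratio_wigner = norm_X / np.sqrt(n)
ratio_markov = norm_M / np.sqrt(2 * n * np.log(n))
max_diagonal = np.max(np.abs(np.diag(M)))
print(ratio_wigner, ratio_markov, max_diagonal / norm_M)
```

The last printed number shows how much of $\|\mathbf{M}_n\|$ is already accounted for by the largest diagonal entry, in line with the domination of the diagonal terms mentioned above.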




REMARK 1.3. The asymptotic of the spectral norm of random Toeplitz $\mathbf{T}_n$
and Hankel $\mathbf{H}_n$ matrices is not addressed in this work.


Theorems 1.4, 1.5 and Corollary 1.6 are proved in Section 2. The proofs of
Theorems 1.1 and 1.2, which are similar to each other, ultimately rely on the method
of moments and the well-known relation

$$\int x^k \, d\hat{\mu}(\mathbf{A}) = \frac{1}{n}\operatorname{tr}\mathbf{A}^k$$


for an n x n symmetric matrix A. We begin in Section 3 by introducing the com- 
binatorial structures which describe the moments of the limiting distributions. 
(Proofs of the properties of the limiting distributions are postponed to the Ap- 
pendix.) Then in Section 4.1 we use truncation arguments to reduce the theorems 
to the case when the expected values of the moments of the spectral measures are 
finite. In Section 4.2 we show that under suitable integrability assumptions the ex- 
pected values of moments of the spectral measures converge to the corresponding 
expressions from Section 3 as the size of the matrix $n \to \infty$. Representing the
moments as traces, we use independence of the entries and combinatorial arguments
to discard the irrelevant terms in the expansions (4.7) and (4.12). In Section 4.4 
we show that the moments of the spectral measures are concentrated around their 
means, which allows us to conclude the proofs in Section 4.5. 
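The trace relation that drives the method of moments, namely that the $k$th moment of the spectral measure equals $\frac{1}{n}\operatorname{tr}\mathbf{A}^k$, can be checked on any symmetric matrix. The test matrix, its size, the power $k = 4$ and the seed below are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(5)                   # illustrative seed
n, k = 7, 4
G = rng.standard_normal((n, n))
A = (G + G.T) / 2                                # an arbitrary symmetric test matrix

eigenvalues = np.linalg.eigvalsh(A)
# k-th moment of the empirical distribution of the eigenvalues
moment_from_spectrum = np.mean(eigenvalues ** k)
# the same quantity computed without any eigenvalue decomposition
moment_from_trace = np.trace(np.linalg.matrix_power(A, k)) / n
```

Working with traces is what lets the expected moments be expanded as sums over products of matrix entries.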

The proof of Theorem 1.3 follows a similar plan, with truncation argument in 
Section 4.1, followed by combinatorial analysis of expansion (4.17) for the traces 
and concentration of moments in Section 4.6. 


2. Proofs of Theorems 1.4, 1.5 and Corollary 1.6. We need the following 
result, which follows by Chebyshev's inequality from [18], Section 6, Theorem 5 
or [19], Section 5, Corollary 5. 


LEMMA 2.1 (Sakhanenko). Let $\{\xi_i;\ i = 1, 2, \ldots\}$ be a sequence of independent
random variables with mean zero and $E\xi_i^2 = \sigma_i^2$. If $E|\xi_i|^p < \infty$ for some
$p > 2$, then there exists a constant $C > 0$ and $\{\eta_i;\ i = 1, 2, \ldots\}$, a sequence of
independent normally distributed random variables with $\eta_i \sim N(0, \sigma_i^2)$ such that

$$P\Big(\max_{1 \le k \le n} |S_k - T_k| > x\Big) \le \frac{C}{1 + |x|^p} \sum_{i=1}^{n} E|\xi_i|^p$$

for any $n$ and $x > 0$, where $S_k = \sum_{i=1}^{k} \xi_i$ and $T_k = \sum_{i=1}^{k} \eta_i$.


PROOF OF THEOREM 1.5. Hereafter let $b(n) = \sqrt{2n\log n}$ denote the
normalization function for Theorem 1.5.


It follows from (1.3) that $\big|\,\|\mathbf{M}_n\| - \|\mathbf{D}_n\|\,\big| \le \|\mathbf{X}_n\|$. So, by (1.4) and the
definition of $\mathbf{D}_n$, it suffices to show that as $n \to \infty$,

$$W_n := \frac{1}{b(n)} \max_{i=1}^{n} \Big|\sum_{j=1}^{n} X_{ij}\Big| \to 1 \quad \text{a.s.} \tag{2.1}$$

We first show the upper bound, that is,

$$\limsup_{n \to \infty} W_n \le 1 \quad \text{a.s.} \tag{2.2}$$


Note that $\{X_{ij};\ j \ge 1\}$ is a sequence of i.i.d. random variables for each $i \ge 1$.
By Lemma 2.1 and the condition that $E|X_{12}|^4 < \infty$, for each $i \ge 1$, there exists a
sequence of independent standard normals $\{Y_{ij};\ j \ge 1\}$ such that

$$\max_{i=1}^{n} P\Big(\max_{k=1}^{n} \Big|\sum_{j=1}^{k} (X_{ij} - Y_{ij})\Big| > x\Big) \le \frac{Cn}{1 + x^4} \tag{2.3}$$


for all x > 0 and n > 1, where C is a constant which does not depend on n and x 
(note that two sequences {Y;;; j > 1) for different values of i are not independent 
of each other). We claim that 


























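$$U_n := \frac{1}{b(n)} \max_{1 \le i \le n} \Bigl|\sum_{j=1}^{n} (X_{ij} - Y_{ij})\Bigr| \to 0 \quad \text{a.s.} \tag{2.4}$$

as $n \to \infty$. First,

$$\max_{2^m \le n \le 2^{m+1}} U_n \le \frac{1}{b(2^m)} \max_{1 \le i \le 2^{m+1}} \max_{1 \le k \le 2^{m+1}} \Bigl|\sum_{j=1}^{k} (X_{ij} - Y_{ij})\Bigr|.$$

By (2.3), for any $\varepsilon > 0$,

$$P\Bigl(\max_{2^m \le n \le 2^{m+1}} U_n > \varepsilon\Bigr) \le 2^{m+1} \max_{1 \le i \le 2^{m+1}} P\Bigl(\max_{1 \le k \le 2^{m+1}} \Bigl|\sum_{j=1}^{k} (X_{ij} - Y_{ij})\Bigr| > \varepsilon\, b(2^m)\Bigr) \le \frac{C_\varepsilon}{m^2}$$

for some constant $C_\varepsilon$ depending only on $\varepsilon$. Since $\varepsilon > 0$ is arbitrary, by the Borel-Cantelli lemma, $\max_{2^m \le n \le 2^{m+1}} U_n \to 0$ a.s. as $m \to \infty$, which implies (2.4). Let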


$$V_n := \frac{1}{b(n)} \max_{1 \le i \le n} \Bigl|\sum_{j=1}^{n} Y_{ij}\Bigr|.$$








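By the definitions in (2.1) and (2.4), we have that $W_n \le U_n + V_n$, so by (2.4) we get (2.2) as soon as we show that $\limsup_{n \to \infty} V_n \le 1$ a.s. To this end, fix $\delta > 0$ and $\alpha > 1/\delta$. Then,

$$\begin{aligned}
P\Bigl(\max_{m^\alpha \le n \le (m+1)^\alpha} V_n > 1 + \delta\Bigr)
&\le (m+1)^\alpha \max_{1 \le i \le (m+1)^\alpha} P\Bigl(\max_{1 \le k \le (m+1)^\alpha} \Bigl|\sum_{j=1}^{k} Y_{ij}\Bigr| > (1+\delta)\, b(m^\alpha)\Bigr) \\
&\le 2(m+1)^\alpha \max_{1 \le i \le (m+1)^\alpha} P\Bigl(\Bigl|\sum_{j=1}^{(m+1)^\alpha} Y_{ij}\Bigr| \ge (1+\delta)\, b(m^\alpha)\Bigr),
\end{aligned} \tag{2.5}$$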


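where Lévy's inequality is used in the second step. Since the $Y_{ij}$'s are independent standard normals, $\xi := (m+1)^{-\alpha/2} \sum_{j=1}^{(m+1)^\alpha} Y_{ij}$ is a standard normal random variable. Thus, by the well-known normal tail estimate

$$\frac{1}{\sqrt{2\pi}}\, \frac{x}{1 + x^2}\, e^{-x^2/2} \le P(\xi > x) \le \frac{1}{\sqrt{2\pi}}\, \frac{1}{x}\, e^{-x^2/2} \quad \text{for } x > 0, \tag{2.6}$$

we see that

$$P\bigl(|\xi| \ge (1+\delta)(m+1)^{-\alpha/2}\, b(m^\alpha)\bigr) \le C_\delta\, m^{-\alpha(1+\delta)}$$

for some constant $C_\delta > 0$. Consequently, for some $C_\delta' > 0$ and all $m$, by (2.5),

$$P\Bigl(\max_{m^\alpha \le n \le (m+1)^\alpha} V_n > 1 + \delta\Bigr) \le C_\delta'\, m^{-\alpha\delta}.$$

With $\alpha\delta > 1$, we have by the Borel-Cantelli lemma that

$$\limsup_{m \to \infty} \max_{m^\alpha \le n \le (m+1)^\alpha} V_n \le 1 + \delta \quad \text{a.s.}$$

It follows that $\limsup_{n \to \infty} V_n \le 1 + \delta$ a.s. and taking $\delta \downarrow 0$ we obtain (2.2).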
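The two-sided normal tail estimate used in this step can be checked numerically. The sketch below assumes one standard form of the bound, namely $\varphi(x)\,x/(1+x^2) \le P(\xi > x) \le \varphi(x)/x$ with $\varphi$ the standard normal density; the function names are illustrative only.

```python
import math


def normal_tail(x):
    """P(xi > x) for a standard normal xi, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def tail_bounds(x):
    """Assumed Mills-ratio type bounds (lower, upper) for P(xi > x), x > 0."""
    density = math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
    return density * x / (1.0 + x * x), density / x


def sandwich_holds(x):
    """True when lower <= exact tail <= upper at the point x."""
    lower, upper = tail_bounds(x)
    return lower <= normal_tail(x) <= upper


if __name__ == "__main__":
    for x in (0.1, 0.5, 1.0, 2.0, 4.0, 6.0):
        print(x, tail_bounds(x)[0], normal_tail(x), tail_bounds(x)[1])
```

Both bounds are asymptotically sharp as $x \to \infty$, which is why they pin down the $\sqrt{2n \log n}$ normalization.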
We next prove that

$$\liminf_{n \to \infty} W_n \ge 1 \quad \text{a.s.} \tag{2.7}$$

To this end, fixing $1/3 > \varepsilon > \delta > 0$, let $n_\varepsilon := [n^{1-\varepsilon}] + 1$. Then,




















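$$\begin{aligned}
W_n &\ge \frac{1}{b(n)} \max_{1 \le i \le n_\varepsilon} \Bigl|\sum_{j=1}^{n} X_{ij}\Bigr| \\
&\ge \frac{1}{b(n)} \max_{1 \le i \le n_\varepsilon} \Bigl|\sum_{j=n_\varepsilon+1}^{n} X_{ij}\Bigr| - \frac{1}{b(n)} \max_{1 \le i \le n_\varepsilon} \Bigl|\sum_{j=1}^{n_\varepsilon} X_{ij}\Bigr| \\
&=: V_{n,1} - V_{n,2}.
\end{aligned} \tag{2.8}$$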


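By (2.2), $\limsup_{n \to \infty} W_{n_\varepsilon} \le 1$ a.s. Thus, with $b(n_\varepsilon)/b(n) \to 0$ as $n \to \infty$, we have that

$$V_{n,2} = W_{n_\varepsilon}\, \frac{b(n_\varepsilon)}{b(n)} \to 0 \quad \text{a.s.} \tag{2.9}$$

Since $\{X_{ij};\ 1 \le i \le n_\varepsilon,\ n_\varepsilon < j \le n\}$ are i.i.d. for any $n \ge 1$, it follows that

$$P(V_{n,1} \le 1 - 3\delta) = P\Bigl(\Bigl|\sum_{j=1}^{n - n_\varepsilon} X_{1j}\Bigr| \le (1 - 3\delta)\, b(n)\Bigr)^{n_\varepsilon}. \tag{2.10}$$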








With $b(n) \ge \sqrt{n}$, by Lemma 2.1 there exists a sequence of independent standard normals $\{Y_j\}$ such that for some $C = C(\delta) < \infty$ and all $n$,

$$P\Bigl(\Bigl|\sum_{j=1}^{n - n_\varepsilon} (X_{1j} - Y_j)\Bigr| \ge \delta\, b(n)\Bigr) \le C n^{-1}. \tag{2.11}$$


Further, by the left inequality of (2.6) we have that for all n sufficiently large, 
i 


Combining this bound with (2.11) and (2.10) we get that for all $n$ large enough
P(V.1x1—38) < (1—-2n 969 4 Cn!) 
< (1—270-79y"* « 7. 


Recall that $\varepsilon > \delta$, implying that $\sum_{n} P(V_{n,1} \le 1 - 3\delta) < \infty$. By the Borel-Cantelli lemma,


n-e 


2 Y; 


j=l 








zde 23e) « P(Y < (1 — 3. /2lgn) 


<1— 2n 0-79. 


$$\liminf_{n \to \infty} V_{n,1} \ge 1 - 3\delta \quad \text{a.s.}$$

This together with (2.8) and (2.9) implies that almost surely $\liminf_{n \to \infty} W_n \ge 1 - 3\delta$, and the lower bound (2.7) follows by taking $\delta \downarrow 0$. □


PROOF OF COROLLARY 1.6. Let $\widetilde{M}_n$ denote the Markov matrix obtained when $\widetilde{X}_{ij} = X_{ij} - EX_{ij}$ replaces $X_{ij}$ in (1.3). Obviously,

$$M_n = \widetilde{M}_n + Y_n, \tag{2.12}$$

where $Y_n = [Y_{ij}]$ is the $n \times n$ matrix with $Y_{ij} = m - nm\,1_{i=j}$. Clearly, $\lambda_1(Y_n) = 0$ and $\lambda_2(Y_n) = \cdots = \lambda_n(Y_n) = -nm$, so $\|Y_n\| = n|m|$. By (2.12) and Theorem 1.5, we have that

$$\Bigl|\frac{\|M_n\|}{n} - \frac{\|Y_n\|}{n}\Bigr| \le \frac{\|\widetilde{M}_n\|}{n} \to 0$$

as $n \to \infty$. This implies that $\|M_n\|/n \to |m|$ a.s. □


In the context of this paper, the next lemma is very handy for truncation pur- 
poses. 


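LEMMA 2.2. Let $\{X_{ij} : j > i \ge 1\}$ be an infinite triangular array of i.i.d. random variables with $EX_{12} = 0$ and $\mathrm{Var}(X_{12}) = \sigma^2$. Let $X_{ji} = X_{ij}$ for $i < j$ and set $X_{ii} = 0$ for all $i \ge 1$. Then,

$$\frac{1}{n^2} \sum_{i=1}^{n} \Bigl(\sum_{j=1}^{n} X_{ij}\Bigr)^2 \to \sigma^2 \quad \text{a.s.}$$

as $n \to \infty$.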


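PROOF. Define

$$U_n := \sum_{i=1}^{n} \sum_{1 \le j < k \le n} X_{ij} X_{ik}. \tag{2.13}$$

Then

$$\frac{1}{n^2} \sum_{i=1}^{n} \Bigl(\sum_{j=1}^{n} X_{ij}\Bigr)^2 = \frac{1}{n^2} \sum_{i=1}^{n} \sum_{j=1}^{n} X_{ij}^2 + \frac{2}{n^2}\, U_n.$$

By the strong law of large numbers, the first term on the right-hand side converges almost surely to $\sigma^2$, so it suffices to show that

$$n^{-2}\, U_n \to 0 \quad \text{a.s.} \tag{2.14}$$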


To this end, denote by $\mathcal{F}_k$ the $\sigma$-algebra generated by the random variables $\{X_{ij},\ 1 \le i, j \le k\}$. Noting that

$$U_{n+1} - U_n = \sum_{1 \le j < k \le n} X_{n+1,j} X_{n+1,k} + \sum_{i=1}^{n} \sum_{j=1}^{n} X_{ij} X_{i,n+1},$$

it is easy to verify that $\{U_n : n \ge 1\}$ is a martingale for the filtration $\{\mathcal{F}_n : n \ge 1\}$. Further, the $n^2(n-1)/2$ terms in the sum (2.13) are uncorrelated. Indeed, if $i \ne i'$ and $j < k$, $j' < k'$, then $E(X_{ij} X_{ik} X_{i'j'} X_{i'k'}) = 0$ as at least one of the four variables in this product must be independent of the others. Thus, $E(U_n^2) \le \sigma^4 n^2(n-1)/2$ for any $n \ge 2$, and by Doob's submartingale inequality, for any $\varepsilon > 0$,

$$P\Bigl(\max_{1 \le i \le m^2} |U_i| \ge m^4 \varepsilon\Bigr) \le \frac{E(U_{m^2}^2)}{m^8 \varepsilon^2} \le \frac{\sigma^4}{m^2 \varepsilon^2}.$$

It follows by the Borel-Cantelli lemma that almost surely

$$Z_m := m^{-4} \max_{1 \le i \le m^2} |U_i| \to 0$$

as $m \to \infty$. Since $n^{-2}|U_n| \le (m/(m-1))^4 Z_m$ whenever $(m-1)^2 \le n \le m^2$, $m \ge 2$, we thus get (2.14). □


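Let $d_{BL}$ denote the bounded Lipschitz metric

$$d_{BL}(\mu, \nu) = \sup\Bigl\{\int f\,d\mu - \int f\,d\nu : \|f\|_\infty + \|f\|_L \le 1\Bigr\}, \tag{2.15}$$

where $\|f\|_\infty = \sup_x |f(x)|$, $\|f\|_L = \sup_{x \ne y} |f(x) - f(y)|/|x - y|$. It is well known (see [8], Section 11.3) that $d_{BL}$ is a metric for the weak convergence of measures. For the spectral measures of $n \times n$ symmetric real matrices $A$, $B$ we have

$$d_{BL}(\hat\mu(A), \hat\mu(B)) \le \sup\Bigl\{\frac{1}{n} \sum_{j=1}^{n} |f(\lambda_j(A)) - f(\lambda_j(B))| : \|f\|_L \le 1\Bigr\} \le \frac{1}{n} \sum_{j=1}^{n} |\lambda_j(A) - \lambda_j(B)|.$$

By Lidskii's theorem ([13], see also [1], Lemma 2.3)

$$\sum_{j=1}^{n} |\lambda_j(A) - \lambda_j(B)|^2 \le \mathrm{tr}((B - A)^2),$$

so

$$d_{BL}^2(\hat\mu(A), \hat\mu(B)) \le \frac{1}{n}\, \mathrm{tr}((B - A)^2). \tag{2.16}$$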
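The eigenvalue inequality behind (2.16) can be sanity-checked on small matrices. The following is a minimal sketch for symmetric $2 \times 2$ matrices only, using the closed-form eigenvalues; the encoding of a matrix as a triple `(a, b, d)` for $\begin{pmatrix} a & b \\ b & d \end{pmatrix}$ and all function names are illustrative choices, not notation from the paper.

```python
import math
import random


def eig_sym2(a, b, d):
    """Eigenvalues, in ascending order, of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.hypot((a - d) / 2.0, b)
    return (mean - radius, mean + radius)


def eigen_gap_sq(A, B):
    """Sum over j of (lambda_j(A) - lambda_j(B))^2 with eigenvalues sorted."""
    return sum((x - y) ** 2 for x, y in zip(eig_sym2(*A), eig_sym2(*B)))


def trace_diff_sq(A, B):
    """tr((B - A)^2) for symmetric 2x2 matrices given as (a, b, d)."""
    da, db, dd = B[0] - A[0], B[1] - A[1], B[2] - A[2]
    return da * da + 2.0 * db * db + dd * dd


def random_pairs(count, seed=0):
    """Generate `count` random pairs of symmetric 2x2 matrices."""
    rng = random.Random(seed)
    draw = lambda: tuple(rng.uniform(-5.0, 5.0) for _ in range(3))
    return [(draw(), draw()) for _ in range(count)]


if __name__ == "__main__":
    print(all(eigen_gap_sq(A, B) <= trace_diff_sq(A, B) + 1e-9
              for A, B in random_pairs(1000)))
```

Dividing both sides by $n$ and applying the Cauchy-Schwarz inequality to the mean of $|\lambda_j(A) - \lambda_j(B)|$ gives the bound on $d_{BL}^2$.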


PROOF OF THEOREM 1.4. We use the notation from the proof of Corollary 1.6 and write $\sigma^2 = \mathrm{Var}(X_{12})$. By (2.12) and (2.16) the bounded Lipschitz metric (2.15) satisfies

$$d_{BL}^2(\hat\mu(M_n/n), \hat\mu(Y_n/n)) \le n^{-3}\, \mathrm{tr}(\widetilde{M}_n^2). \tag{2.17}$$

Note that $\{\widetilde{X}_{ij};\ 1 \le i < j\}$ are i.i.d. random variables with mean zero and finite variance. By the classical strong law of large numbers and Lemma 2.2

$$n^{-2}\, \mathrm{tr}(\widetilde{M}_n^2) = \frac{1}{n^2} \Bigl(2 \sum_{1 \le i < j \le n} \widetilde{X}_{ij}^2 + \sum_{i=1}^{n} \Bigl(\sum_{j=1}^{n} \widetilde{X}_{ij}\Bigr)^2\Bigr) \to 2\sigma^2 \quad \text{a.s.} \tag{2.18}$$

as $n \to \infty$. Recall that all but one of the eigenvalues of $Y_n$ are $-nm$, hence $\hat\mu(Y_n/n)$ converges weakly to $\delta_{-m}$. Combining this with (2.17) and (2.18), we have that almost surely, $\hat\mu(M_n/n)$ converges weakly to $\delta_{-m}$. □


3. The limiting distributions $\gamma_H$, $\gamma_M$ and $\gamma_T$.


3.1. Moments. For a probability measure $\gamma$ on $(\mathbb{R}, \mathcal{B})$, denote its moments by

$$m_k(\gamma) = \int x^k\, \gamma(dx).$$

The probability measures $\gamma_H$, $\gamma_M$ and $\gamma_T$ will be determined from their moments. It turns out that the odd moments are zero, and the even moments are the sums of numbers labeled by the pair partitions of $\{1, \ldots, 2k\}$.

It is convenient to index the pair partitions by the partition words $w$; these are words of length $|w| = 2k$ with $k$ pairs of letters such that the first occurrences of each of the $k$ letters are in alphabetic order. In the case $k = 2$ we have $1 \times 3$ such partition words

    aabb        abba        abab,

which correspond to the pair partitions

    {1,2} ∪ {3,4}        {1,4} ∪ {2,3}        {1,3} ∪ {2,4}

of $\{1, 2, 3, 4\}$. Recall that the number of pair partitions of $\{1, \ldots, 2k\}$ is $1 \times 3 \times \cdots \times (2k - 1)$.


DEFINITION 3.1. For a partition word $w$, we define its height $h(w)$ as the number of encapsulated partition subwords, that is, substrings of the form $x w_1 x$, where $x$ is a single letter, and $w_1$ is either a partition word or the empty word.


For example, $h(abcabc) = 0$, $h(\underline{a}bcbc\underline{a}) = h(ab\underline{cc}ab) = 1$, while $h(\underline{aabbcc}) = h(\underline{abccba}) = 3$ (the encapsulating pairs of letters are underlined).

In the terminology of Bożejko and Speicher [5], $h$ assigns to a pair partition the number of connected blocks which are of cardinality 2. These connected blocks of cardinality 2 are the pairs of letters underlined in the previous examples.

In Proposition A.5 we show that the even moments of the free convolution $\gamma_M$ of the semicircle and standard normal measures are given by

$$m_{2k}(\gamma_M) = \sum_{w:\,|w| = 2k} 2^{h(w)}. \tag{3.1}$$
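Definition 3.1 and formula (3.1) are easy to evaluate by brute force for small $k$. The sketch below enumerates partition words, computes $h(w)$ by testing whether the letters strictly between the two occurrences of a letter form complete pairs, and sums $2^{h(w)}$. The function names are illustrative; the enumeration is exponential in $k$ and is meant only for small cases.

```python
def pair_partitions(points):
    """Yield every pair partition of the sorted list `points` as a list of (i, j) pairs."""
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pair_partitions(remaining):
            yield [(first, partner)] + tail


def partition_words(k):
    """Yield the (2k - 1)!! partition words of length 2k (k at most 26)."""
    for pairs in pair_partitions(list(range(2 * k))):
        word = [""] * (2 * k)
        # Pairs are produced in order of their first position, so the first
        # occurrences of the letters come out in alphabetic order.
        for letter_index, (i, j) in enumerate(pairs):
            word[i] = word[j] = chr(ord("a") + letter_index)
        yield "".join(word)


def height(word):
    """Number of substrings x w1 x with w1 empty or made of complete pairs."""
    h = 0
    for i, letter in enumerate(word):
        j = word.find(letter, i + 1)
        if j == -1:
            continue  # position i is the second occurrence of its letter
        inner = word[i + 1:j]
        if all(inner.count(c) == 2 for c in inner):
            h += 1
    return h


def markov_moment(k):
    """Right-hand side of (3.1): sum of 2**h(w) over partition words of length 2k."""
    return sum(2 ** height(w) for w in partition_words(k))


if __name__ == "__main__":
    print([markov_moment(k) for k in (1, 2, 3)])
```

For $k = 2$ the three words contribute $4 + 4 + 1 = 9$, which agrees with the fourth moment computed from the free cumulants of the semicircle and standard normal laws.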


For the Toeplitz and Hankel cases, with each partition word w we associate a 
system of linear equations which determine the cross section of the unit hypercube, 
and define the corresponding volume p(w). We have to consider these two cases 
separately. 


3.2. Toeplitz volumes. Let $w[j]$ denote the letter in position $j$ of the word $w$. For example, if $w = abab$, then $w[1] = a$, $w[2] = b$, $w[3] = a$, $w[4] = b$.

To every partition word $w$ we associate the following system of equations in unknowns $x_0, x_1, \ldots, x_{2k}$:

$$\begin{aligned}
x_1 - x_0 + x_{m_1} - x_{m_1 - 1} &= 0, && \text{if } m_1 > 1 \text{ is such that } w[1] = w[m_1], \\
x_2 - x_1 + x_{m_2} - x_{m_2 - 1} &= 0, && \text{if there is } m_2 > 2 \text{ such that } w[2] = w[m_2], \\
&\ \,\vdots \\
x_i - x_{i-1} + x_{m_i} - x_{m_i - 1} &= 0, && \text{if there is } m_i > i \text{ such that } w[i] = w[m_i], \\
&\ \,\vdots \\
x_{2k-1} - x_{2k-2} + x_{2k} - x_{2k-1} &= 0, && \text{if } w[2k-1] = w[2k].
\end{aligned} \tag{3.2}$$


Although we list $2k - 1$ equations, in fact $k - 1$ of them are empty. Informally, the left-hand sides of the equations are formed by adding the differences over the same letter when the variables are written in the space "between the letters." For example, writing the variables between the letters of the word $w = abab\,c \ldots c \ldots$ we get

$${}^{x_0} a\, {}^{x_1} b\, {}^{x_2} a\, {}^{x_3} b\, {}^{x_4} c\, {}^{x_5} \cdots\, {}^{x_n} c\, {}^{x_{n+1}} \cdots. \tag{3.3}$$

The corresponding system of equations is

$$\begin{aligned}
x_1 - x_0 + x_3 - x_2 &= 0, \\
x_2 - x_1 + x_4 - x_3 &= 0, \\
x_5 - x_4 + x_{n+1} - x_n &= 0, \\
&\ \,\vdots
\end{aligned} \tag{3.4}$$


Since in every partition word $w$ of length $2k$ there are exactly $k$ distinct letters, this is the system of $k$ equations in $2k + 1$ unknowns. We solve it for the variables that follow the last occurrence of a letter, leaving us with $k + 1$ undetermined variables: $x_0$, and the $k$ variables that follow the first occurrence of each letter.

We then require that the dependent variables lie in the interval $I = [0, 1]$. This determines a cross section of the cube $I^{k+1}$ in the remaining undetermined $k + 1$ coordinates, the volume of which we denote by $p_T(w)$. For example, if $w = abab$, solving the first pair of equations (3.4) for $x_3 = x_0 - x_1 + x_2$, $x_4 = x_0$, defines the solid

$$\{x_0 - x_1 + x_2 \in I\} \cap \{x_0 \in I\} \subset I^3,$$

which has the (Eulerian) volume $p_T(abab) = 4/3! = 2/3$.
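The volume $p_T(w)$ can be estimated by simulation directly from this recipe: draw the $k + 1$ undetermined variables uniformly on $[0, 1]$, solve system (3.2) for the variable after each second occurrence of a letter, and record how often all dependent variables land in $I$. This is a Monte Carlo sketch, so its output is an estimate rather than an exact volume; the function names, sample size and seed are illustrative choices.

```python
import random


def toeplitz_path(word, free_values):
    """Solve system (3.2) forward and return [x_0, ..., x_{2k}].

    `free_values` supplies x_0 and then the variable following the first
    occurrence of each letter, in order of appearance.
    """
    free = iter(free_values)
    x = [next(free)]  # x_0 is always undetermined
    first_seen = {}
    for pos, letter in enumerate(word, start=1):
        if letter not in first_seen:
            first_seen[letter] = pos
            x.append(next(free))
        else:
            i = first_seen[letter]
            # x_i - x_{i-1} + x_pos - x_{pos-1} = 0
            x.append(x[pos - 1] - x[i] + x[i - 1])
    return x


def estimate_p_toeplitz(word, samples=100_000, seed=0):
    """Monte Carlo estimate of p_T(word)."""
    rng = random.Random(seed)
    k = len(word) // 2
    hits = 0
    for _ in range(samples):
        x = toeplitz_path(word, [rng.random() for _ in range(k + 1)])
        if all(0.0 <= v <= 1.0 for v in x):
            hits += 1
    return hits / samples


if __name__ == "__main__":
    for w in ("aabb", "abba", "abab"):
        print(w, estimate_p_toeplitz(w))
```

For $k = 2$ the words $aabb$ and $abba$ impose no constraint beyond the cube itself, while $abab$ should come out close to $2/3$.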
We define measure $\gamma_T$ as a symmetric measure with even moments

$$m_{2k}(\gamma_T) = \sum_{w:\,|w| = 2k} p_T(w). \tag{3.5}$$

From Proposition 4.5 below it follows that (3.5) indeed defines a positive definite sequence of numbers so that these are indeed the even moments of a probability measure. Since $m_{2k}$ is at most the number $(2k - 1)!!$ of words of length $2k$, these moments determine the limiting distribution $\gamma_T$ uniquely.


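3.3. Hankel volumes. We proceed similarly to the Toeplitz case. With each partition word $w$ we associate the following system of equations in unknowns $x_0, x_1, \ldots, x_{2k}$:

$$\begin{aligned}
x_1 + x_0 &= x_{m_1} + x_{m_1 - 1}, && \text{if } m_1 > 1 \text{ is such that } w[1] = w[m_1], \\
x_2 + x_1 &= x_{m_2} + x_{m_2 - 1}, && \text{if there is } m_2 > 2 \text{ such that } w[2] = w[m_2], \\
&\ \,\vdots \\
x_i + x_{i-1} &= x_{m_i} + x_{m_i - 1}, && \text{if there is } m_i > i \text{ such that } w[i] = w[m_i], \\
&\ \,\vdots \\
x_{2k-1} + x_{2k-2} &= x_{2k} + x_{2k-1}, && \text{if } w[2k-1] = w[2k].
\end{aligned} \tag{3.6}$$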


Informally, the equations are formed by equating the sums of the variables at the same letter. For example, the word $abab$ with the variables written as in (3.3) gives rise to the system of equations

$$\begin{aligned}
x_1 + x_0 &= x_3 + x_2, \\
x_2 + x_1 &= x_4 + x_3.
\end{aligned} \tag{3.7}$$

As in the Toeplitz case, since there are exactly $k$ distinct letters in the word, this is the system of $k$ equations in $2k + 1$ unknowns. We solve it for the variables that precede the first occurrence of a letter, leaving us with $k$ undetermined variables $x_{\alpha_1}, \ldots, x_{\alpha_k} = x_{2k-1}$ that precede the second occurrence of each letter, and with the $(k + 1)$st undetermined variable $x_{2k}$. We add to the system (3.6) one more equation:

$$x_0 = x_{2k}.$$

As previously, we require that the dependent variables are in the interval $I = [0, 1]$. This determines a cross section of the cube $I^{k+1}$ in the remaining $k + 1$ coordinates with the volume which we denote by $p_H(w)$.

Due to the additional constraint $x_{2k} = x_0$, this volume might be zero. For example, (3.7) has solutions $x_0 = 2x_2 - x_4$, $x_1 = x_3 - x_2 + x_4$ with undetermined variables $x_2, x_3, x_4$. Equation $x_0 = x_4$ gives the additional relation $x_4 = x_2$, and reduces the dimension of the solid $\{2x_2 - x_4 \in I\} \cap \{x_3 - x_2 + x_4 \in I\} \cap \{x_4 = x_2\} \subset I^3$ to 2. Thus the corresponding volume is $p_H(abab) = 0$.
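The effect of the extra constraint $x_0 = x_{2k}$ shows up clearly in a lattice count. The sketch below counts integer solutions of (3.6) with all $x_i \in \{1, \ldots, n\}$ and $x_0 = x_{2k}$, divided by $n^{k+1}$. As an assumption made for convenience, the system is solved forward (for the variable after each second occurrence) instead of backward as in the text; this only changes the parametrization, since the set of integer solutions being counted is the same. Function names are illustrative.

```python
from itertools import product


def hankel_path(word, free_values):
    """Solve system (3.6) forward and return [x_0, ..., x_{2k}]."""
    free = iter(free_values)
    x = [next(free)]  # x_0
    first_seen = {}
    for pos, letter in enumerate(word, start=1):
        if letter not in first_seen:
            first_seen[letter] = pos
            x.append(next(free))
        else:
            i = first_seen[letter]
            # x_i + x_{i-1} = x_pos + x_{pos-1}
            x.append(x[i] + x[i - 1] - x[pos - 1])
    return x


def hankel_fraction(word, n):
    """Fraction of the n**(k+1) lattice paths that are circuits inside {1, ..., n}."""
    k = len(word) // 2
    good = 0
    for free in product(range(1, n + 1), repeat=k + 1):
        x = hankel_path(word, free)
        if x[-1] == x[0] and all(1 <= v <= n for v in x):
            good += 1
    return good / n ** (k + 1)


if __name__ == "__main__":
    for w in ("aabb", "abba", "abab"):
        print(w, [hankel_fraction(w, n) for n in (4, 8, 16)])
```

For $abab$ the circuit condition forces $x_2 = x_0$, so only $n^2$ of the $n^3$ lattice paths survive and the fraction is $1/n$, which tends to $p_H(abab) = 0$; for $aabb$ and $abba$ every path is an admissible circuit.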

We define measure $\gamma_H$ as a symmetric measure with even moments

$$m_{2k}(\gamma_H) = \sum_{w:\,|w| = 2k} p_H(w). \tag{3.8}$$

From Proposition 4.7 below it follows that (3.8) indeed defines a positive definite sequence of numbers so that these are indeed the even moments of a probability measure. Since $m_{2k}$ is at most the number $(2k - 1)!!$ of words of length $2k$, these moments determine the limiting distribution $\gamma_H$ uniquely.


3.4. Relation to Eulerian numbers. The Eulerian numbers $A_{n,m}$ are often defined by their generating function or by the combinatorial description as the number of permutations $\sigma$ of $\{1, \ldots, n\}$ with $\sigma_i > \sigma_{i-1}$ for exactly $m$ choices of $i = 1, 2, \ldots, n$ (taking $\sigma_0 = 0$). The geometric interpretation is that $A_{n,m}/n!$ is the volume of a solid cut out of the cube $I^n$ by the set $\{x_1 + \cdots + x_n \in [m - 1, m]\}$; see [23]. Converting any $m - 1$ of the coordinates $x$ to $1 - x$, we get that $A_{n,m}/n!$ is the volume of a solid cut out of the cube $I^n$ by the set

$$\{(x_1, \ldots, x_n) \in \mathbb{R}^n : x_1 + x_2 + \cdots + x_{n-m} - (x_{n-m+1} + \cdots + x_n) \in I\}.$$

The solids we encountered in the formula for the $2k$th moments are the intersections of solids of this latter form, with odd values of $n$, each having $m = (n - 1)/2$, and with various subsets of the coordinates entering the expression.
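The combinatorial description can be evaluated by brute force for small $n$. The sketch below assumes the convention read from the text, that is, it counts permutations with $\sigma_i > \sigma_{i-1}$ for exactly $m$ indices $i \in \{1, \ldots, n\}$ with $\sigma_0 = 0$, so the first position always counts. The function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def eulerian(n, m):
    """Count permutations of 1..n with exactly m ascents, where sigma_0 = 0."""
    count = 0
    for sigma in permutations(range(1, n + 1)):
        previous, ascents = 0, 0
        for value in sigma:
            if value > previous:
                ascents += 1
            previous = value
        if ascents == m:
            count += 1
    return count


def slab_volume(n, m):
    """A_{n,m} / n!, the volume of {x in [0,1]^n : x_1 + ... + x_n in [m-1, m]}."""
    return Fraction(eulerian(n, m), factorial(n))


if __name__ == "__main__":
    print([eulerian(3, m) for m in (1, 2, 3)], slab_volume(3, 2))
```

With $n = 3$ and $m = 2$ this gives $4/3! = 2/3$, the value quoted earlier for $p_T(abab)$.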

Another interesting representation is

$$\mathrm{Vol}\bigl(\{(x_1, \ldots, x_n) \in I^n : x_1 + x_2 + \cdots + x_{n-m} - (x_{n-m+1} + \cdots + x_n) \in I\}\bigr) = \frac{2}{\pi} \int_0^\infty \Bigl(\frac{\sin t}{t}\Bigr)^{n+1} \cos((n + 1 - 2m)t)\, dt.$$

This follows from the integral representation of Eulerian numbers in Nicolas [16].

REMARK 3.1. One can verify that the probabilities $p_T(w)$ and $p_H(w)$ are rational numbers, and hence so are $m_{2k}(\gamma_T)$ and $m_{2k}(\gamma_H)$, defined by formulas (3.5) and (3.8) (for details, cf. [6]).

4. Proofs of Theorems 1.1, 1.2 and 1.3. 


4.1. Truncation and centering. We first reduce Theorems 1.1, 1.2 and 1.3 to the case of bounded i.i.d. random variables, and in case of Theorems 1.1 and 1.2, also allow for centering of these variables.


PROPOSITION 4.1. (i) If Theorem 1.1 or Theorem 1.2 holds true for all bounded independent i.i.d. sequences $\{X_j\}$ with mean zero and variance 1, then it holds true for all square-integrable i.i.d. sequences $\{X_j\}$ with variance 1.

(ii) If Theorem 1.3 holds true for all bounded independent i.i.d. collections $\{X_{ij}\}$ with mean zero and variance 1, then it holds true for all square-integrable i.i.d. collections $\{X_{ij}\}$ with mean zero and variance 1.


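PROOF. Without loss of generality, we may assume that $E(X_1) = 0$ in Theorems 1.1 and 1.2. Indeed, from the rank inequality ([1], Lemma 2.2) it follows that subtracting a rank-1 matrix of the means $E(X_1)$ from matrices $T_n$ and $H_n$ does not affect the asymptotic distribution of the eigenvalues.

For a fixed $u > 0$, denote

$$m(u) = E X_1 I_{\{|X_1| > u\}}$$

and let

$$\sigma^2(u) = E X_1^2 I_{\{|X_1| \le u\}} - m^2(u).$$

Clearly, $\sigma^2(u) \le 1$ and since $E(X_1) = 0$, $E(X_1^2) = 1$, we have $m(u) \to 0$ and $\sigma(u) \to 1$ as $u \to \infty$.
Let

$$\widetilde{X}_1 = X_1 I_{\{|X_1| > u\}} - m(u).$$

Notice that $\sigma^2(u) = E(X_1 - \widetilde{X}_1)^2$, therefore the bounded random variable

$$X_1' = \frac{X_1 - \widetilde{X}_1}{\sigma(u)}$$

has mean zero and variance 1. Denote by $T_n'$, $H_n'$ the corresponding Toeplitz and Hankel matrices constructed from the independent bounded random variables

$$X_j' = \frac{X_j - \widetilde{X}_j}{\sigma(u)}$$

distributed as $X_1'$. By the triangle inequality for $d_{BL}(\cdot, \cdot)$ and (2.16),

$$\begin{aligned}
d_{BL}^2\bigl(\hat\mu(T_n/\sqrt{n}), \hat\mu(T_n'/\sqrt{n})\bigr)
&\le 2 d_{BL}^2\bigl(\hat\mu(T_n/\sqrt{n}), \hat\mu(\sigma(u) T_n'/\sqrt{n})\bigr) + 2 d_{BL}^2\bigl(\hat\mu(T_n'/\sqrt{n}), \hat\mu(\sigma(u) T_n'/\sqrt{n})\bigr) \\
&\le \frac{2}{n^2}\, \mathrm{tr}\bigl((T_n - \sigma(u) T_n')^2\bigr) + \frac{2}{n^2}\, (1 - \sigma(u))^2\, \mathrm{tr}\bigl((T_n')^2\bigr).
\end{aligned}$$

It is easy to verify that $E(\widetilde{X}_1^2) = 1 - \sigma^2(u) - 2m(u)^2$ and that with probability 1

$$\frac{1}{n^2}\, \mathrm{tr}\bigl((T_n - \sigma(u) T_n')^2\bigr) = \frac{1}{n} \widetilde{X}_0^2 + \frac{2}{n} \sum_{j=1}^{n-1} \Bigl(1 - \frac{j}{n}\Bigr) \widetilde{X}_j^2 \to E(\widetilde{X}_1^2), \tag{4.1}$$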

as $n \to \infty$ (e.g., sandwiching the coefficients $j/n$ between the piecewise constant $\ell^{-1} \lfloor \ell j/n \rfloor$ and $\ell^{-1} \lceil \ell j/n \rceil$ allows for applying the strong law of large numbers, with the resulting nonrandom bounds converging to $E(\widetilde{X}_1^2)$ as $\ell \to \infty$). Similarly,

$$\frac{1}{n^2}\, \mathrm{tr}\bigl((T_n')^2\bigr) = \frac{1}{n} (X_0')^2 + \frac{2}{n} \sum_{j=1}^{n-1} \Bigl(1 - \frac{j}{n}\Bigr) (X_j')^2 \to E\bigl((X_1')^2\bigr). \tag{4.2}$$

For large $u$, both $m(u)$ and $1 - \sigma(u)$ are arbitrarily small. So, in view of (4.1) and (4.2), with probability 1 the limiting distance in the bounded Lipschitz metric $d_{BL}$ between $\hat\mu(T_n/\sqrt{n})$ and $\hat\mu(T_n'/\sqrt{n})$ is arbitrarily small, for all $u$ sufficiently large. Thus, if the conclusion of Theorem 1.1 holds true for all sequences of independent bounded random variables $\{X_j'\}$ with the same limiting distribution $\gamma_T$, then $\hat\mu(T_n/\sqrt{n})$ must have the same weak limit with probability 1.
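Similarly, we have

$$d_{BL}^2\bigl(\hat\mu(H_n/\sqrt{n}), \hat\mu(H_n'/\sqrt{n})\bigr) \le \frac{2}{n^2}\, \mathrm{tr}\bigl((H_n - \sigma(u) H_n')^2\bigr) + \frac{2}{n^2}\, (1 - \sigma(u))^2\, \mathrm{tr}\bigl((H_n')^2\bigr).$$

By the same argument as before, with probability 1

$$\frac{1}{n^2}\, \mathrm{tr}\bigl((H_n - \sigma(u) H_n')^2\bigr) = \frac{1}{n} \sum_{j=0}^{2n} \Bigl(1 - \frac{|n - j|}{n}\Bigr) \widetilde{X}_j^2 \to E(\widetilde{X}_1^2),$$

and $n^{-2}\, \mathrm{tr}((H_n')^2) \to E((X_1')^2)$. Therefore, with probability 1 the limiting $d_{BL}$-distance between $\hat\mu(H_n/\sqrt{n})$ and $\hat\mu(H_n'/\sqrt{n})$ is arbitrarily small for large enough $u$.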

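Similarly, denoting by $\widetilde{M}_n$, $M_n'$ the corresponding Markov matrices constructed from the independent random variables $\widetilde{X}_{ij}$ and $X_{ij}' = (X_{ij} - \widetilde{X}_{ij})/\sigma(u)$, we have

$$d_{BL}^2\bigl(\hat\mu(M_n/\sqrt{n}), \hat\mu(M_n'/\sqrt{n})\bigr) \le \frac{2}{n^2}\, \mathrm{tr}(\widetilde{M}_n^2) + \frac{2}{n^2}\, (1 - \sigma(u))^2\, \mathrm{tr}\bigl((M_n')^2\bigr).$$

By (2.18), with probability 1, $n^{-2}\, \mathrm{tr}((M_n')^2) \to 2$ and $n^{-2}\, \mathrm{tr}(\widetilde{M}_n^2) \to 2E(\widetilde{X}_{12}^2)$. Therefore, with probability 1, the limiting $d_{BL}$-distance between $\hat\mu(M_n/\sqrt{n})$ and $\hat\mu(M_n'/\sqrt{n})$ is arbitrarily small for large enough $u$. □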


4.2. Combinatorics for Hankel and Toeplitz cases. For $k, n \in \mathbb{N}$, consider circuits in $\{1, \ldots, n\}$ of length $L(\pi) = k$, that is, mappings $\pi : \{0, 1, \ldots, k\} \to \{1, 2, \ldots, n\}$, such that $\pi(0) = \pi(k)$.

Let $s : \mathbb{N}^2 \to \mathbb{N}$ be one of the following two functions: $s_T(x, y) = |x - y|$, or $s_H(x, y) = x + y$. We will use $s$ to match (i.e., pair) the edges $(\pi(i - 1), \pi(i))$ of a circuit $\pi$. The main property of the symmetric function $s$ is that for a fixed value of $s(m, n)$, every initial point $m$ of an edge determines uniquely a finite number (here, at most 2) of the other end-points: if $k, m \in \mathbb{N}$, then

$$\#\{y \in \mathbb{N} : s(m, y) = k\} \le 2. \tag{4.3}$$

For a fixed $s$ as above, we will say that circuit $\pi$ is $s$-matched, or has self-matched edges, if for every $1 \le i \le L(\pi)$ there is $j \ne i$ such that $s(\pi(i - 1), \pi(i)) = s(\pi(j - 1), \pi(j))$.

We will say that a circuit $\pi$ has an edge of order 3, if there are at least three different edges in $\pi$ with the same $s$-value.

The following proposition says that generically self-matched circuits have only pair-matches.


PROPOSITION 4.2. Fix $r \in \mathbb{N}$. Let $N$ denote the number of $s$-matched circuits in $\{1, \ldots, n\}$ of length $r$ with at least one edge of order 3. Then there is a constant $C_r$ such that

$$N \le C_r\, n^{\lfloor (r+1)/2 \rfloor}.$$

In particular, as $n \to \infty$ we have $N / n^{1 + r/2} \to 0$.


PROOF. Either $r = 2k$ is an even number, or $r = 2k - 1$ is an odd number. In both cases, if an $s$-matched circuit has an edge of order 3, then the total number of distinct $s$-values

$$\{s(\pi(i - 1), \pi(i)) : 1 \le i \le L(\pi)\}$$

is at most $k - 1$. We can think of constructing each such circuit from the left to the right. First, we choose the locations for the $s$-matches along $\{1, \ldots, r\}$. This can be done in at most $r!$ ways. Once these locations are fixed, we proceed along the circuit. There are $n$ possible choices for the initial point $\pi(0)$. There are at most $n$ choices for each new $s$-value, and there are at most two ways to complete the edge for each repeat of the already encountered $s$-value. Therefore there are at most $r! \times n \times n^{k-1} \times 2^{r+1-k} \le C_r n^k$ such circuits. □


We say that a set of circuits $\pi_1, \pi_2, \pi_3, \pi_4$ is matched if each edge of any one of these circuits is either self-matched, that is, there is another edge of the same circuit with equal $s$-value, or is cross-matched, that is, there is an edge of the other circuit with the same $s$-value (or both).

The following bound will be used to prove almost sure convergence of moments.


PROPOSITION 4.3. Fix $r \in \mathbb{N}$. Let $N$ denote the number of matched quadruples of circuits in $\{1, \ldots, n\}$ of length $r$ such that none of them is self-matched. Then there is a constant $C_r$ such that

$$N \le C_r\, n^{2r+2}.$$


PROOF. First observe that there are at most $2r$ distinct $s$-values in the $4r$ edges of matched quadruples of circuits of length $r$. Further, the number of quadruples of such circuits for which there are exactly $u$ distinct $s$-values is at most $C_r n^{u+4}$. Indeed, order the edges $(\pi_j(i - 1), \pi_j(i))$ of such quadruples starting at $j = 1$, $i = 1$, then $i = 2, \ldots, r$, followed by $j = 2$, $i = 1$ and then $i = 2, \ldots, r$, and so on. There are at most $u^{4r}$ possible allocations of the distinct $s$-values to these $4r$ edges, at most $n^4$ choices for the starting points $\pi_1(0)$, $\pi_2(0)$, $\pi_3(0)$ and $\pi_4(0)$ of the circuits and at most $n^u$ for the values of $\pi_j(i)$ at those $(j, i)$ for which $(\pi_j(i - 1), \pi_j(i))$ is the leftmost occurrence of one of the distinct $s$-values. Once these choices are made, we proceed to sequentially determine the mapping $\pi_1(i)$ from $i = 0$ to $i = r$, followed by the mappings $\pi_2, \pi_3, \pi_4$, noting that by (4.3) at most $2^{4r-u}$ quadruples can be produced per such choice.

Recall that the number of possible partitions $\mathcal{P}$ of the $4r$ edges of our quadruple of circuits into $|\mathcal{P}|$ distinct groups of $s$-matching edges, with at least two edges in each group, is independent of $n$. Thus, by the preceding bound it suffices to show that for each partition $\mathcal{P}$ with $|\mathcal{P}| \in \{2r - 1, 2r\}$ such that each circuit shares at least one $s$-value with some other circuit, there correspond at most $C n^{2r+2}$ matched quadruples of circuits in $\{1, \ldots, n\}$. To this end, note that $|\mathcal{P}| = 2r$ implies that each $s$-value is shared by exactly two edges, while when $|\mathcal{P}| = 2r - 1$ we also have either two $s$-values shared by three edges each or one $s$-value shared by four edges (but not both).

Fixing hereafter a specific partition $\mathcal{P}$ of this type, it is not hard to check that upon re-ordering our four circuits we have an $s$-value that is assigned to exactly one edge of the circuit $\pi_1$, denoted hereafter $(\pi_1(i_* - 1), \pi_1(i_*))$, and in case $|\mathcal{P}| = 2r$, we also have another $s$-value that does not appear in $\pi_1$ and is assigned to exactly one edge of $\pi_2$, denoted hereafter $(\pi_2(j_* - 1), \pi_2(j_*))$. (Though this property may not hold for all orderings of the four circuits, an inspection of all possible graphs of cross-matches shows that it must hold for some order.)

We are now ready to improve our counting bound for the case of $|\mathcal{P}| = 2r - 1$, by the following dynamic construction of $\pi_1$:

First choose one of the $n$ possible values for the initial value $\pi_1(0)$, and continue filling in the values of $\pi_1(i)$, $i = 1, 2, \ldots, i_* - 1$. Then, starting at $\pi_1(r) = \pi_1(0)$, sequentially choose the values of $\pi_1(r - 1), \pi_1(r - 2), \ldots, \pi_1(i_*)$, thus completing the entire circuit $\pi_1$. This is done in accordance with the $s$-matches determined by $\mathcal{P}$, so there are $n$ ways to complete an edge that has no $s$-match among the edges already constructed, while by (4.3) if an edge is matching one of the edges already available, then it can be completed in at most two ways. Since this procedure determines uniquely the edge $(\pi_1(i_* - 1), \pi_1(i_*))$ and hence the $s$-value assigned to it, it reduces to $2r - 2$ the number of $s$-matches that can each independently assume $O(n)$ values. Consequently, the number of quadruples of circuits corresponding to $\mathcal{P}$ is at most $C n^{2r+2}$.

In case $|\mathcal{P}| = 2r$, we first construct $\pi_1$ by the preceding dynamic construction while determining the $s$-value for the edge $(\pi_1(i_* - 1), \pi_1(i_*))$ out of the circuit condition for $\pi_1$. Then, we repeat the dynamic construction for $\pi_2$, keeping it in accordance with the $s$-values determined already by edges of $\pi_1$ and uniquely determining the edge $(\pi_2(j_* - 1), \pi_2(j_*))$ and hence the $s$-value assigned to it, by the circuit condition for $\pi_2$. Thus, we have again reduced the total number of $s$-matches that can each independently assume $O(n)$ values to $2r - 2$, and consequently, the number of quadruples of circuits corresponding to $\mathcal{P}$ is again at most $C n^{2r+2}$. □


The next result deals only with the slope matching function $s_T(x, y) = |x - y|$.


PROPOSITION 4.4. Fix $k \in \mathbb{N}$. Let $N$ be the number of $s_T$-matched circuits $\pi$ in $\{1, \ldots, n\}$ of length $2k$ with at least one pair of $s_T$-matched edges $(\pi(i - 1), \pi(i))$ and $(\pi(j - 1), \pi(j))$ such that $\pi(i) - \pi(i - 1) + \pi(j) - \pi(j - 1) \ne 0$. Then, as $n \to \infty$ we have

$$n^{-(k+1)} N \to 0.$$


PROOF. By Proposition 4.2, we may and shall consider throughout paths $\pi$ in $\{1, \ldots, n\}$ of length $2k$ for which the absolute values of the slopes $\pi(i) - \pi(i - 1)$ take exactly $k$ distinct nonzero values and, for $\pi$ to be a circuit, the sum of all $2k$ slopes is zero. Let $\mathcal{P}$ denote a partition of the $2k$ slopes to $s_T$-matching pairs, indicating also whether each slope is negative or positive, with $m(\mathcal{P})$ denoting the number of such pairs for which both slopes are positive. Observe that if under $\mathcal{P}$ both slopes of some $s_T$-matching pair are negative, then necessarily $m(\mathcal{P}) \ge 1$, for otherwise the sum of all slopes will not be zero for any path corresponding to $\mathcal{P}$. Thus, it suffices to show that at most $n^k$ circuits $\pi$ correspond to each $\mathcal{P}$ with $m = m(\mathcal{P}) \ge 1$. Indeed, fixing such $\mathcal{P}$, there are at most $n$ ways to choose $\pi(0)$ and $n^{k-m}$ ways to choose the $k - m$ pairs of slopes for which at least one slope in each pair is negative. The remaining $m$ pairs of $s_T$-matching positive slopes are to be chosen among $\{1, \ldots, n\}$ subject to a specified sum (due to the circuit condition). Since there are at most $n^{m-1}$ ways for doing so, the proof is complete. □


4.3. Moments of the average spectral measure. 


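PROPOSITION 4.5. Suppose $\{X_j\}$ is a sequence of bounded i.i.d. random variables such that $E(X_1) = 0$, $E(X_1^2) = 1$. Then for $k \in \mathbb{N}$

$$\lim_{n \to \infty} \frac{1}{n^{k+1}}\, E\, \mathrm{tr}(T_n^{2k}) = \sum_{w:\,|w| = 2k} p_T(w) \tag{4.4}$$

and

$$\lim_{n \to \infty} \frac{1}{n^{k+1/2}}\, E\, \mathrm{tr}(T_n^{2k-1}) = 0. \tag{4.5}$$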

PROOF. For a circuit $\pi : \{0, 1, \ldots, r\} \to \{1, 2, \ldots, n\}$ write

$$X_\pi = \prod_{i=1}^{r} X_{|\pi(i) - \pi(i-1)|}. \tag{4.6}$$

Then

$$E\, \mathrm{tr}(T_n^r) = \sum_{\pi} E X_\pi, \tag{4.7}$$

where the sum is over all circuits in $\{1, \ldots, n\}$ of length $r$.

By Hölder's inequality, for any finite set $\Pi$ of circuits of length $r$

$$\Bigl|\sum_{\pi \in \Pi} E X_\pi\Bigr| \le E(|X_1|^r)\, \#\Pi. \tag{4.8}$$

Since $|X_1|^r$ is bounded, we can use the bound (4.8) to discard the "nongeneric" circuits from the sum in (4.7). To this end, note that since the random variables $\{X_j\}$ are independent and have mean zero, the term $E X_\pi$ vanishes for every circuit $\pi$ with at least one unpaired $X_j$. Since $T_n$ is a symmetric matrix, by (4.6) paired variables correspond to the slopes of the circuit $\pi$ which are equal in absolute value. Hence, the only circuits that make a nonzero contribution to (4.7) are those with matched absolute values of the slopes. This fits the formalism of Section 4.2 with the matching function $s_T(x, y) = |x - y|$.

If $r = 2k - 1 > 0$ is odd, then each $s_T$-matched circuit $\pi$ of length $r$ must have an edge of order 3. From (4.8) and Proposition 4.2 we get $|E\, \mathrm{tr}(T_n^{2k-1})| \le C n^k$, proving (4.5).

When $r = 2k$ is an even number, let $\Pi$ be the set of all circuits $\pi : \{0, 1, \ldots, 2k\} \to \{1, \ldots, n\}$ with the set of slopes $\{\pi(i) - \pi(i - 1) : i = 1, \ldots, 2k\}$ consisting of $k$ distinct nonnegative integers $s_1, \ldots, s_k$ and their counterparts $-s_1, \ldots, -s_k$. From (4.8) and Proposition 4.4 it follows that

$$\lim_{n \to \infty} \frac{1}{n^{k+1}} \Bigl|E\, \mathrm{tr}(T_n^{2k}) - \sum_{\pi \in \Pi} E X_\pi\Bigr| = 0.$$

Moreover, for every circuit $\pi \in \Pi$, if $X_j$ enters the product $X_\pi$, then it occurs in it exactly twice, resulting with $E X_\pi = 1$, and consequently with $\sum_{\pi \in \Pi} E X_\pi = \#\Pi$. Therefore, the following lemma completes the proof of (4.4), and with it, that of Proposition 4.5. □
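The circuit expansion (4.7) can be evaluated exactly for small $n$ in one special case. The sketch below assumes, purely for illustration, that the entries are Rademacher (equal to $\pm 1$ with probability $1/2$ each), which satisfies the boundedness, mean and variance conditions of Proposition 4.5. Then $E X_\pi$ equals 1 when every value of $|\pi(i) - \pi(i-1)|$ occurs an even number of times along the circuit and 0 otherwise, so $E\, \mathrm{tr}(T_n^r)$ reduces to a count of circuits. The function name is illustrative.

```python
from collections import Counter
from itertools import product


def expected_trace_rademacher(n, r):
    """E tr(T_n^r) for a Toeplitz matrix with i.i.d. Rademacher entries (assumption).

    Counts circuits pi of length r in {1, ..., n} in which every absolute
    slope value is used an even number of times.
    """
    total = 0
    for vertices in product(range(1, n + 1), repeat=r):
        path = vertices + (vertices[0],)  # close the circuit: pi(r) = pi(0)
        slopes = Counter(abs(path[i] - path[i - 1]) for i in range(1, r + 1))
        if all(mult % 2 == 0 for mult in slopes.values()):
            total += 1
    return total


if __name__ == "__main__":
    for n in (4, 8, 12):
        print(n, expected_trace_rademacher(n, 4) / n ** 3)
```

For $r = 4$ the printed ratios should approach $\sum_{|w| = 4} p_T(w) = 1 + 1 + 2/3 = 8/3$ from below as $n$ grows, with a correction of order $1/n$ coming from circuits counted under more than one partition word.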


LEMMA 4.6.

$$\lim_{n \to \infty} \frac{1}{n^{k+1}}\, \#\Pi = \sum_{w} p_T(w),$$

where the sum is over the finite set of partition words $w$ of length $2k$.


PROOF. The circuits in $\Pi$ can be labeled by the partition words $w$ of length $2k$ which list the positions of the pairs of $s_T$-matches along $\{1, \ldots, 2k\}$. This generates the partition $\Pi = \bigcup_w \Pi(w)$ into the corresponding equivalence classes.

To every such partition word $w$ we can assign $n^{k+1}$ paths $\pi(i) = x_i$, $i = 0, \ldots, 2k$, obtained by solving the system of equations (3.2), with values $1, 2, \ldots, n$ for each of the $k + 1$ undetermined variables, and the remaining $k$ values computed from the equations [which represent the relevant $s_T$-matches for any $\pi \in \Pi(w)$]. Some of these paths will fail to be in the admissible range $\{1, \ldots, n\}$. Let $p_n(w)$ be the fraction of the $n^{k+1}$ paths that stay within the admissible range $\{1, \ldots, n\}$, noting that by Proposition 4.2, $p_n(w) - n^{-(k+1)}\, \#\Pi(w) \to 0$.

Interpreting the undetermined variables $x_i$ as the discrete uniform independent random variables with values $\{1, 2, \ldots, n\}$, $p_n(w)$ becomes the probability that the computed values stay within the prescribed range. As $n \to \infty$, the $k + 1$ undetermined variables $x_i/n$ converge in law to independent uniform $U[0, 1]$ random variables $U_i$. Since $p_n(w)$ is the probability of the (independent of $n$) event $A_w$ that the solution of (3.2) starting with $x_i/n \in \{1/n, 2/n, \ldots, 1\}$ has all the dependent variables in $(0, 1]$, it follows that $p_n(w)$ converges to $p_T(w)$, the probability of the event $A_w$ that the corresponding sums of independent uniform $U[0, 1]$ random variables take their values in the interval $[0, 1]$. □
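The discrete fraction $p_n(w)$ from this proof can be computed exactly for small $n$ by enumerating the $n^{k+1}$ choices of the undetermined variables. The following is a minimal sketch with illustrative function names; it solves (3.2) for the variable after each second occurrence of a letter and uses exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product


def toeplitz_path(word, free_values):
    """Solve system (3.2) forward and return [x_0, ..., x_{2k}]."""
    free = iter(free_values)
    x = [next(free)]  # x_0
    first_seen = {}
    for pos, letter in enumerate(word, start=1):
        if letter not in first_seen:
            first_seen[letter] = pos
            x.append(next(free))
        else:
            i = first_seen[letter]
            # x_i - x_{i-1} + x_pos - x_{pos-1} = 0
            x.append(x[pos - 1] - x[i] + x[i - 1])
    return x


def lattice_fraction(word, n):
    """p_n(word): fraction of the n**(k+1) paths staying inside {1, ..., n}."""
    k = len(word) // 2
    inside = sum(
        1
        for free in product(range(1, n + 1), repeat=k + 1)
        if all(1 <= v <= n for v in toeplitz_path(word, free))
    )
    return Fraction(inside, n ** (k + 1))


if __name__ == "__main__":
    for n in (2, 6, 12):
        print(n, lattice_fraction("abab", n))
```

For $w = abab$ a direct count of triples $(x_0, x_1, x_2)$ with $1 \le x_0 - x_1 + x_2 \le n$ gives $p_n(w) = (2n^2 + 1)/(3n^2)$, which converges to $p_T(abab) = 2/3$ at rate $1/n^2$.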


Next we give the Hankel version of Proposition 4.5. 


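PROPOSITION 4.7. Let $\{X_j\}$ be a sequence of bounded i.i.d. random variables such that $E(X_1) = 0$, $E(X_1^2) = 1$. For $k \in \mathbb{N}$,

$$\lim_{n \to \infty} \frac{1}{n^{k+1}}\, E\, \mathrm{tr}(H_n^{2k}) = \sum_{w:\,|w| = 2k} p_H(w) \tag{4.9}$$

and

$$\lim_{n \to \infty} \frac{1}{n^{k+1/2}}\, E\, \mathrm{tr}(H_n^{2k-1}) = 0. \tag{4.10}$$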

PROOF. We mimic the procedure for the Toeplitz case. For a circuit $\pi : \{0, 1, \ldots, r\} \to \{1, 2, \ldots, n\}$ write

$$X_\pi = \prod_{i=1}^{r} X_{\pi(i) + \pi(i-1)}. \tag{4.11}$$

As previously,

$$E\, \mathrm{tr}(H_n^r) = \sum_{\pi} E X_\pi, \tag{4.12}$$

where the sum is over all circuits in $\{1, \ldots, n\}$ of length $r$, and by Hölder's inequality, we again have the bound (4.8), which for bounded $|X_1|^r$ we use to discard the "nongeneric" circuits from the sum in (4.12). To this end, with the random variables $X_j$ independent and of mean zero, the term $E X_\pi$ vanishes for every circuit $\pi$ with at least one unpaired $X_j$. By (4.11), in the current setting paired variables correspond to an $s_H$-matching in the circuit $\pi$. Hence, only $s_H$-matched circuits (in the formalism of Section 4.2) can make a nonzero contribution to (4.12).

If $r = 2k - 1 > 0$ is odd, then each $s_H$-matched circuit $\pi$ of length $r$ must have an edge of order 3. From (4.8) and Proposition 4.2 we get $|E\, \mathrm{tr}(H_n^{2k-1})| \le C n^k$, proving (4.10).

When $r = 2k$ is an even number, let $\Pi$ be the set of all circuits $\pi : \{0, 1, \ldots, 2k\} \to \{1, \ldots, n\}$ with the $s_H$-values consisting of $k$ distinct numbers. Recall that $E X_\pi = 1$ for any $\pi \in \Pi$ [see (4.11)]. Further, with any $s_H$-matched circuit not in $\Pi$ having an edge of order 3, it follows from (4.8) and Proposition 4.2 that

$$\lim_{n \to \infty} \frac{1}{n^{k+1}} \bigl|E\, \mathrm{tr}(H_n^{2k}) - \#\Pi\bigr| = 0.$$

Therefore, the following lemma completes the proof of (4.9), and with it, that of Proposition 4.7. □


LEMMA 4.8.

$$\lim_{n \to \infty} \frac{1}{n^{k+1}}\, \#\Pi = \sum_{w:\,|w| = 2k} p_H(w).$$


PROOF. Similarly to the proof of Lemma 4.6, label the circuits in $\Pi$ by the partition words $w$ which list the positions of the pairs of $s_H$-matches along $\{1, \ldots, 2k\}$, with the corresponding partition $\Pi = \bigcup_w \Pi(w)$ into equivalence classes. To every such partition word $w$ we can assign $n^{k+1}$ paths $\pi(i) = x_i$, $i = 0, \ldots, 2k$, obtained by solving the system of equations (3.6), with values $1, 2, \ldots, n$ for each of the $k + 1$ undetermined variables, and the remaining $k$ values computed from the equations. Some of these paths will fail to be a circuit, and some will fail to stay in the admissible range $\{1, \ldots, n\}$. Let $p_n(w)$ denote the fraction of the paths that stay within the admissible range $\{1, \ldots, n\}$ and are circuits, noting that $p_n(w) - n^{-(k+1)}\, \#\Pi(w) \to 0$ by Proposition 4.2. Thus, $p_n(w)$ is the probability of the event $A_w$ that the solution of (3.6) starting with the undetermined variables $x_i$ that are independent discrete uniform random variables on the set $\{1/n, 2/n, \ldots, 1\}$, stays within $(0, 1]$ and satisfies the additional condition $x_0 = x_{2k}$. It follows that as $n \to \infty$, the probabilities $p_n(w)$ converge to $p_H(w)$, the probability of the event $A_w$ with the undetermined variables now being independent and uniformly distributed on $[0, 1]$. □


4.4. Concentration of moments of the spectral measure.


PROPOSITION 4.9. Let $\{X_j\}$ be a sequence of bounded i.i.d. random variables such that $E(X_1) = 0$ and $E(X_1^2) = 1$. Fix $r \in \mathbb{N}$. Then there is $C_r < \infty$ such that for all $n \in \mathbb{N}$ we have

$$E\bigl[(\mathrm{tr}(T_n^r) - E\, \mathrm{tr}(T_n^r))^4\bigr] \le C_r n^{2r+2} \quad \text{and} \quad E\bigl[(\mathrm{tr}(H_n^r) - E\, \mathrm{tr}(H_n^r))^4\bigr] \le C_r n^{2r+2}.$$


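PROOF. The argument again relies on the enumeration of paths. Since both
proofs are very similar, we analyze only the Hankel case.
Using the circuit notation of (4.11) we have that

\[ \mathbb{E}\bigl[(\operatorname{tr}(H_n^r) - \mathbb{E}\operatorname{tr}(H_n^r))^4\bigr]
 = \sum_{\pi_1,\pi_2,\pi_3,\pi_4} \mathbb{E}\Bigl[\prod_{j=1}^{4} \bigl(X_{\pi_j} - \mathbb{E}(X_{\pi_j})\bigr)\Bigr], \tag{4.13} \]

where the sum is taken over all circuits $\pi_j$, $j = 1,\ldots,4$, on $\{1,\ldots,n\}$ of length $r$
each. With the random variables $X_j$ independent and of mean zero, any circuit $\pi_k$
which is not matched together with the remaining three circuits has $\mathbb{E}(X_{\pi_k}) = 0$
and

\[ \mathbb{E}\Bigl[\prod_{j=1}^{4} \bigl(X_{\pi_j} - \mathbb{E}(X_{\pi_j})\bigr)\Bigr]
 = \mathbb{E}\Bigl[X_{\pi_k} \prod_{j \ne k} \bigl(X_{\pi_j} - \mathbb{E}(X_{\pi_j})\bigr)\Bigr] = 0. \]

Further, if one of the circuits, say $\pi_1$, is only self-matched, that is, has no
cross-matched edge, then obviously

\[ \mathbb{E}\Bigl[\prod_{j=1}^{4} \bigl(X_{\pi_j} - \mathbb{E}(X_{\pi_j})\bigr)\Bigr]
 = \mathbb{E}\bigl[X_{\pi_1} - \mathbb{E}(X_{\pi_1})\bigr]\,
   \mathbb{E}\Bigl[\prod_{j=2}^{4} \bigl(X_{\pi_j} - \mathbb{E}(X_{\pi_j})\bigr)\Bigr] = 0. \]

Therefore, it suffices to take the sum in (4.13) over all $s_H$-matched quadruples of
circuits on $\{1,\ldots,n\}$, such that none of them is self-matched. By Proposition 4.3,
there are at most $C_r n^{2r+2}$ such quadruples of circuits, and with $|X_j|$ (hence $|X_\pi|$)
bounded, this completes the proof. □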


4.5. Proofs of the Hankel and Toeplitz cases. 


PROOF OF THEOREM 1.1. Proposition 4.1(i) implies that without loss of
generality we may assume that the random variables $\{X_j\}$ are centered and bounded.

By Proposition 4.5 the odd moments of the average measure $\mathbb{E}(\hat\mu(T_n/\sqrt{n}))$
converge to 0, and the even moments converge to $m_{2k}$ of (3.5). By Chebyshev's
inequality we have from Proposition 4.9 that for any $\delta > 0$ and $k, n \in \mathbb{N}$,

\[ \mathbb{P}\Bigl( \Bigl| \int x^k \, d\hat\mu(T_n/\sqrt{n}) - \int x^k \, d\,\mathbb{E}(\hat\mu(T_n/\sqrt{n})) \Bigr| > \delta \Bigr)
 \le C_k \delta^{-4} n^{-2}. \]

Thus, by the Borel-Cantelli lemma, with probability 1, $\int x^k \, d\hat\mu(T_n/\sqrt{n}) \to \int x^k \, d\gamma_T$
as $n \to \infty$, for every $k \in \mathbb{N}$. In particular, with probability 1, the random
measures $\{\hat\mu(T_n/\sqrt{n})\}$ are tight, and since the moments determine $\gamma_T$ uniquely,
we have the weak convergence of $\hat\mu(T_n/\sqrt{n})$ to $\gamma_T$.

Since the moments do not depend on the distribution of the i.i.d. sequence $\{X_j\}$,
the limiting distribution $\gamma_T$ does not depend on the distribution of $X_j$ either, and is
symmetric as all its odd moments are zero. By Proposition A.1, it has unbounded
support. □
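The exponent bookkeeping behind the Chebyshev step in the proof of Theorem 1.1 can be spelled out as follows. This is only a worked restatement, assuming the normalization $\int x^k \, d\hat\mu(T_n/\sqrt{n}) = n^{-(1+k/2)} \operatorname{tr}(T_n^k)$ and the fourth-moment bound of Proposition 4.9 with $r = k$ read as order $n^{2k+2}$.

```latex
\begin{align*}
\mathbb{P}\Bigl( n^{-(1+k/2)} \bigl| \operatorname{tr}(T_n^k) - \mathbb{E}\operatorname{tr}(T_n^k) \bigr| > \delta \Bigr)
 &\le \delta^{-4}\, n^{-(4+2k)}\,
      \mathbb{E}\bigl[ (\operatorname{tr}(T_n^k) - \mathbb{E}\operatorname{tr}(T_n^k))^4 \bigr] \\
 &\le \delta^{-4}\, n^{-(4+2k)}\, C_k\, n^{2k+2}
  \;=\; C_k\, \delta^{-4}\, n^{-2}.
\end{align*}
```

Since $\sum_n n^{-2} < \infty$, the bound is summable in $n$ for each fixed $\delta$ and $k$, which is what the Borel-Cantelli lemma requires.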


PROOF OF THEOREM 1.2. We follow the same line of reasoning as in the
proof of Theorem 1.1, starting by assuming without loss of generality that $\{X_j\}$ is a
sequence of centered and bounded random variables, in view of Proposition 4.1(i).
Then, by Proposition 4.7, as $n \to \infty$ the odd moments of the average measure
$\mathbb{E}(\hat\mu(H_n/\sqrt{n}))$ converge to 0, and the even moments converge to $m_{2k}$ of (3.8),
whereas from Proposition 4.9 we conclude that with probability 1 the same applies
to the moments of $\hat\mu(H_n/\sqrt{n})$. The almost sure convergence
$\int x^k \, d\hat\mu(H_n/\sqrt{n}) \to \int x^k \, d\gamma_H$ as $n \to \infty$, for all $k \in \mathbb{N}$, implies tightness of
$\hat\mu(H_n/\sqrt{n})$ and its weak convergence to the nonrandom measure $\gamma_H$. Since its
moments do not depend on the distribution of the i.i.d. sequence $\{X_j\}$, neither does
the limiting distribution $\gamma_H$, which is symmetric since all its odd moments are zero.
By Proposition A.2 it has unbounded support, and is not unimodal. □


4.6. Markov matrices with centered entries. In view of Proposition 4.1(ii) we
may and shall assume hereafter without loss of generality that the random variables
$X_{ij}$ are bounded. Our proof of Theorem 1.3 follows an outline similar to that used
in proving Theorems 1.1 and 1.2, where the combinatorial arguments used here
rely on matrix decomposition.

Starting with some notation we shall use throughout the proof, let $\Gamma_n$ be a graph
whose vertices are two-element subsets of $\{1,\ldots,n\}$ with the edges between
vertices $a$ and $b$ if the sets overlap, $a \cap b \ne \emptyset$. We indicate that $(a,b)$ is an edge of $\Gamma_n$
by writing $a \sim b$, and for $a \in \Gamma_n$ let $a = \{a^-, a^+\}$ with $1 \le a^- < a^+ \le n$.

The main tool in the Markov case is the following decomposition:

\[ M_n = \sum_{a \in \Gamma_n} X_a Q_{a,a}, \]

where $X_a := X_{a^+,a^-}$ and $Q_{a,b}$ is the $n \times n$ matrix defined for vertices $a, b$ of $\Gamma_n$
by

\[ Q_{a,b}[i,j] = \begin{cases}
 -1, & \text{if } i = a^+,\ j = b^+, \text{ or } i = a^-,\ j = b^-, \\
 \phantom{-}1, & \text{if } i = a^+,\ j = b^-, \text{ or } i = a^-,\ j = b^+, \\
 \phantom{-}0, & \text{otherwise.}
\end{cases} \]

Let $t_{a,b} = \operatorname{tr}(Q_{a,b})$. It is straightforward to check that

\[ t_{a,b} = \begin{cases}
 -2, & \text{if } a = b, \\
 -1, & \text{if } a \ne b \text{ and } a^- = b^- \text{ or } a^+ = b^+, \\
 \phantom{-}1, & \text{if } a^- = b^+ \text{ or } a^+ = b^-, \\
 \phantom{-}0, & \text{otherwise.}
\end{cases} \]


From this, we see that $t_{a,b} = t_{b,a}$. Since it is easy to check that
$Q_{a,b} \times Q_{c,d} = t_{b,c} Q_{a,d}$, we get

\[ \operatorname{tr}\bigl(Q_{a_1,a_1} \times Q_{a_2,a_2} \times \cdots \times Q_{a_r,a_r}\bigr)
 = \prod_{j=1}^{r} t_{a_j,a_{j+1}}, \tag{4.14} \]

where for convenience we identified $a_{r+1}$ with $a_1$.
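The two facts used above, the trace table for $t_{a,b}$ and the product rule $Q_{a,b} Q_{c,d} = t_{b,c} Q_{a,d}$, can be checked exhaustively for a small $n$. The sketch below builds $Q_{a,b}$ directly from its case definition; the function names and the tuple encoding $(a^-, a^+)$ of a vertex are illustrative choices, not notation from the text.

```python
from itertools import combinations, product


def q_matrix(a, b, n):
    """Q_{a,b} from the case definition; a = (a_minus, a_plus), 1-based."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    q = [[0] * n for _ in range(n)]
    q[a_hi - 1][b_hi - 1] = -1
    q[a_lo - 1][b_lo - 1] = -1
    q[a_hi - 1][b_lo - 1] = 1
    q[a_lo - 1][b_hi - 1] = 1
    return q


def t_value(a, b):
    """The table of traces t_{a,b} as stated in the text."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    if a == b:
        return -2
    if a_lo == b_lo or a_hi == b_hi:
        return -1
    if a_lo == b_hi or a_hi == b_lo:
        return 1
    return 0


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


def matmul(x, y):
    n = len(x)
    return [[sum(x[i][l] * y[l][j] for l in range(n)) for j in range(n)]
            for i in range(n)]


def check_identities(n):
    """True if tr(Q_{a,b}) = t_{a,b} and Q_{a,b} Q_{c,d} = t_{b,c} Q_{a,d}
    hold for all vertices a, b, c, d of Gamma_n."""
    vertices = list(combinations(range(1, n + 1), 2))
    for a, b in product(vertices, repeat=2):
        if trace(q_matrix(a, b, n)) != t_value(a, b):
            return False
    for a, b, c, d in product(vertices, repeat=4):
        left = matmul(q_matrix(a, b, n), q_matrix(c, d, n))
        right = [[t_value(b, c) * e for e in row] for row in q_matrix(a, d, n)]
        if left != right:
            return False
    return True


if __name__ == "__main__":
    print(check_identities(4))
```

Both identities follow from writing $Q_{a,b} = -u_a u_b^{\mathsf T}$ with $u_a = e_{a^+} - e_{a^-}$, so that $t_{a,b} = -u_a \cdot u_b$; the exhaustive check is just a safeguard against sign slips.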
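For a circuit $\pi = (a_1 \sim \cdots \sim a_r \sim a_1)$ of length $r$ in $\Gamma_n$, let

\[ X_\pi = \prod_{j=1}^{r} t_{a_j,a_{j+1}} \prod_{j=1}^{r} X_{a_j}. \tag{4.15} \]

It follows from (4.14) and (4.15) that

\[ \operatorname{tr}(M_n^r) = \sum_{\pi} X_\pi, \tag{4.16} \]

where the sum is over all circuits of length $r$ in $\Gamma_n$, leading to the Markov analog
of the path expansion (4.7),

\[ \mathbb{E}\operatorname{tr}(M_n^r) = \sum_{\pi} \mathbb{E} X_\pi. \tag{4.17} \]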
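The circuit expansion (4.16) is an exact algebraic identity, so it can be tested numerically on any symmetric input. The sketch below compares $\operatorname{tr}(M_n^r)$, computed by matrix multiplication, with the sum of $\prod_j t_{a_j,a_{j+1}} X_{a_j}$ over all $r$-tuples of vertices (tuples that are not circuits of $\Gamma_n$ contribute zero because some $t_{a_j,a_{j+1}}$ vanishes). The integer entries produced by `example_entries` are hypothetical test data, and $M_n$ is taken as $X_n$ minus the diagonal matrix of its row sums.

```python
from itertools import combinations, product


def t_value(a, b):
    """Trace table t_{a,b} for vertices a = (a_minus, a_plus), b likewise."""
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    if a == b:
        return -2
    if a_lo == b_lo or a_hi == b_hi:
        return -1
    if a_lo == b_hi or a_hi == b_lo:
        return 1
    return 0


def markov_matrix(x, n):
    """M = X - D, D = diag(row sums); x maps (i, j) with i < j to X_ij."""
    m = [[0] * n for _ in range(n)]
    for (i, j), value in x.items():
        m[i - 1][j - 1] = value
        m[j - 1][i - 1] = value
        m[i - 1][i - 1] -= value
        m[j - 1][j - 1] -= value
    return m


def trace_of_power(m, r):
    n = len(m)
    power = [[int(i == j) for j in range(n)] for i in range(n)]
    for _ in range(r):
        power = [[sum(power[i][l] * m[l][j] for l in range(n))
                  for j in range(n)] for i in range(n)]
    return sum(power[i][i] for i in range(n))


def circuit_expansion(x, r):
    """Right-hand side of (4.16), summed over all r-tuples of vertices."""
    vertices = list(x)
    total = 0
    for circuit in product(vertices, repeat=r):
        term = 1
        for j in range(r):
            term *= t_value(circuit[j], circuit[(j + 1) % r]) * x[circuit[j]]
        total += term
    return total


def example_entries(n):
    """Hypothetical integer entries X_a, used only as test data."""
    return {a: 3 * a[0] - 2 * a[1] + 1
            for a in combinations(range(1, n + 1), 2)}


if __name__ == "__main__":
    x = example_entries(4)
    for r in range(1, 5):
        print(r, trace_of_power(markov_matrix(x, 4), r), circuit_expansion(x, r))
```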


We say that a circuit $\pi = (a_1 \sim \cdots \sim a_r \sim a_1)$ of length $r$ in $\Gamma_n$ is
vertex-matched if for each $i = 1,\ldots,r$ there exists some $j \ne i$ such that $a_i = a_j$, and
that it has a match of order 3 if some value is repeated at least three times among
$(a_j,\ j = 1,\ldots,r)$. Note that the only nonvanishing terms in (4.17) come from
vertex-matched circuits.

In analogy with Proposition 4.2, we show next that generically vertex-matched
circuits have only double repeats, and consequently, the odd moments
of $\mathbb{E}\hat\mu(M_n/\sqrt{n})$ converge to zero as $n \to \infty$.


PROPOSITION 4.10. Fix $r \in \mathbb{N}$. Let $N$ denote the number of vertex-matched
circuits in $\Gamma_n$ with $r$ vertices which have at least one match of order 3. Then there
is a constant $C_r$ such that for all $n \in \mathbb{N}$

\[ N \le C_r n^{\lfloor (r+1)/2 \rfloor}. \]

PROOF. Either $r = 2k$ is even, or $r = 2k - 1$ is odd. In both cases, the total
number of distinct vertices per path is at most $k - 1$. Since $a_1 \sim a_2 \sim \cdots \sim a_r$,
there are at most $n^2$ choices for $a_1$, and then at most $4n$ choices for each of the
remaining $k - 2$ distinct values of $a_j$, and one choice for each repeated value. Thus
$N \le C n^2 \times n^{k-2} = C n^k$. □


COROLLARY 4.11. If $\{X_{ij};\ j > i \ge 1\}$ are bounded i.i.d. random variables
such that $\mathbb{E}(X_{12}) = 0$, $\mathbb{E}(X_{12}^2) = 1$, then

\[ \lim_{n\to\infty} \frac{1}{n^{k+1/2}} \mathbb{E}\operatorname{tr}(M_n^{2k-1}) = 0. \tag{4.18} \]

PROOF. If $\mathbb{E} X_\pi$ is nonzero, then all the vertices of the path $a_1 \sim a_2 \sim \cdots \sim a_{2k-1}$
must be repeated at least twice. So for an odd number of vertices, there must
be a vertex which is repeated at least three times. Thus, by Proposition 4.10 and
the boundedness of $|X_{ij}|$ and of $t_{a,b}$,

\[ |\mathbb{E}\operatorname{tr}(M_n^{2k-1})| \le C_k n^k, \]

and (4.18) follows. □


Let $W_n = n^{1/2} Z_n + X_n + \xi I_n$, where $X_n$ is a symmetric $n \times n$ matrix with
i.i.d. standard normal random variables (except for the symmetry constraint),
$Z_n = \operatorname{diag}(Z_{ii})_{1 \le i \le n}$, with i.i.d. standard normal variables $Z_{ii}$ that are
independent of $X_n$, and $\xi$ is a standard normal, independent of all other variables. A direct
combinatorial evaluation of the even moments of $\mathbb{E}\hat\mu(M_n/\sqrt{n})$ is provided in [6].
We follow here an alternative, shorter proof, proposed to us by O. Zeitouni. The
key step, provided by our next lemma, replaces the even moments by those of the
better understood matrix ensemble $W_n$.


LEMMA 4.12. Suppose $\{X_{ij};\ j > i \ge 1\}$ is a collection of bounded i.i.d.
random variables such that $\mathbb{E}(X_{12}) = 0$, $\mathbb{E}(X_{12}^2) = 1$. Then, for every $k \in \mathbb{N}$,

\[ \lim_{n\to\infty} n^{-(k+1)} \bigl[\mathbb{E}\operatorname{tr}(M_n^{2k}) - \mathbb{E}\operatorname{tr}(W_n^{2k})\bigr] = 0. \tag{4.19} \]

PROOF. First observe that by Proposition 4.10, we may and shall assume
without loss of generality that $\{X_{ij}\}$ is a collection of i.i.d. standard normal random
variables, subject to the symmetry constraint $X_{ij} = X_{ji}$ [as such a change affects
$n^{-(k+1)} \mathbb{E}\operatorname{tr}(M_n^{2k})$ by at most $C_k n^{-1}$]. Recall the representation $M_n = X_n - D_n$
of (1.3) and let $\widetilde{M}_n = X_n - \widetilde{D}_n$, where $\widetilde{D}_n$ is obtained by omitting the last row
and column of the diagonal matrix $D'_{n+1}$, which is an independent copy of $D_{n+1}$
that is independent of $X_n$. Observe that the diagonal entries of $-\widetilde{D}_n$ are jointly
normal, of zero mean, variance $n + 1$ and such that the covariance of each pair is 1.
Therefore, with $-\widetilde{D}_n$ independent of $X_n$, for each $n$, the distribution of $\widetilde{M}_n$ is
exactly the same as that of $W_n$. Consequently, (4.19) is equivalent to

\[ \lim_{n\to\infty} n^{-(k+1)} \mathbb{E}\bigl[\operatorname{tr}(M_n^{2k}) - \operatorname{tr}(\widetilde{M}_n^{2k})\bigr] = 0. \tag{4.20} \]
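The covariance claim used in this step can be verified by exact bookkeeping rather than simulation. In the sketch below each diagonal entry $\sum_{j=1}^{n+1} X_{ij}$ of the $(n+1)$-dimensional diagonal matrix is represented by the set of independent standard normal variables it sums (one per unordered index pair, because of the symmetry constraint), so covariances are just intersection sizes. The function names are illustrative; the comparison target is the diagonal part $n^{1/2} Z_{ii} + \xi$ of $W_n$.

```python
def truncated_row_sum_covariance(n):
    """Exact covariance matrix of the first n diagonal entries of the
    (n+1)-dimensional matrix diag(sum_j X_ij), with X symmetric and its
    distinct entries i.i.d. of unit variance."""
    support = [{frozenset((i, j)) for j in range(1, n + 2)}
               for i in range(1, n + 1)]
    # Each variable enters with coefficient 1, so Cov = number of shared ones.
    return [[len(support[i] & support[j]) for j in range(n)] for i in range(n)]


def w_diagonal_covariance(n):
    """Covariance matrix of the vector (sqrt(n) Z_ii + xi), i = 1..n."""
    return [[n + 1 if i == j else 1 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for n in (1, 2, 5):
        print(n, truncated_row_sum_covariance(n) == w_diagonal_covariance(n))
```

Because both vectors are centered and jointly Gaussian, equality of the covariance matrices is enough to identify their laws; changing the sign of the diagonal matrix does not affect the covariance.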


The first step in proving (4.20) is to note that by a path expansion similar to (4.17)
we have that

\[ \mathbb{E}\bigl[\operatorname{tr}(M_n^{2k}) - \operatorname{tr}(\widetilde{M}_n^{2k})\bigr]
 = \sum_{\pi} \bigl[\mathbb{E} M_\pi - \mathbb{E} \widetilde{M}_\pi\bigr], \tag{4.21} \]

where now the sum is over all circuits $\pi : \{0,\ldots,2k\} \to \{1,\ldots,n\}$, and

\[ M_\pi = \prod_{i=1}^{2k} M_{\pi(i-1),\pi(i)}, \]

with the corresponding expression for $\widetilde{M}_\pi$. Set each word $w$ of length $2k$ to be a
circuit by assigning $w[0] = w[2k]$ and let $\Pi(w)$ denote the collection of circuits
$\pi$ such that the distinct letters of $w$ are in a one-to-one correspondence with the
distinct values of $\pi$. Let $v = v(w)$ be the number of distinct letters in the word $w$,
noting that $\#\Pi(w) \le n^{v(w)}$ and that $\mathbb{E} M_\pi - \mathbb{E} \widetilde{M}_\pi = f_n(w)$ is independent of the
specific choice of $\pi \in \Pi(w)$. Hence, taking the letters of $w$ to be from the set
of numbers $\{1,2,\ldots,2k\}$ with the convention that $\pi(i) = w[i]$, we identify $w$ as
a representative of $\pi \in \Pi(w)$ (recall $w[0] = w[2k]$). For example, $w = abbc$ of
$v(w) = 3$ distinct letters becomes $w = 1223$ which we identify with the circuit
$\pi \in \Pi(w)$ of length 4 consisting of the edges $\{1,2\}$, $\{2,2\}$, $\{2,3\}$ and $\{3,1\}$. In
view of (4.21), we thus establish (4.20) by showing that for any $w$, some $C_w < \infty$
and all $n$,

\[ |f_n(w)| = |\mathbb{E} M_w - \mathbb{E} \widetilde{M}_w| \le C_w n^{k - v(w) + 1/2}. \tag{4.22} \]


Let $q = q(w)$ be the number of indices $1 \le i \le 2k$ for which $w[i] = w[i-1]$
[e.g., $q(1223) = 1$]. It is clear from the definition of $M_n$ and $\widetilde{M}_n$ that $f_n(w) \ne 0$
only if $q(w) \ge 1$. Let $u = u(w)$ count the number of edges of distinct endpoints
in $w$, namely, with $\{w[i-1], w[i]\} \in \Gamma_n$, which appear exactly once along the
circuit $w$ [e.g., $u(1223) = 3$]. Then, by independence and centering we have that
$\mathbb{E}\widetilde{M}_w = 0$ as soon as $u(w) \ge 1$, whereas it is not hard to check that if $u(w) > q(w)$,
then also $\mathbb{E} M_w = 0$. Thus, it suffices to consider in (4.22) only circuits $w$ with
$q(w) \ge u(w)$.
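The three word statistics can be computed mechanically, which is a convenient way to check the worked values $q(1223) = 1$ and $u(1223) = 3$. The sketch below closes the word into a circuit by setting $w[0] = w[2k]$, as in the text; the function name and the return convention $(v, q, u)$ are illustrative.

```python
from collections import Counter


def word_statistics(word):
    """Return (v, q, u) for a word w[1..2k] closed by w[0] = w[2k]:
    v = number of distinct letters,
    q = number of loop edges (w[i] == w[i-1]),
    u = number of non-loop edges {w[i-1], w[i]} traversed exactly once."""
    letters = list(word)
    closed = [letters[-1]] + letters          # closed[i] plays the role of w[i]
    edges = [frozenset((closed[i - 1], closed[i]))
             for i in range(1, len(closed))]
    v = len(set(letters))
    q = sum(1 for edge in edges if len(edge) == 1)
    multiplicity = Counter(edge for edge in edges if len(edge) == 2)
    u = sum(1 for count in multiplicity.values() if count == 1)
    return v, q, u


if __name__ == "__main__":
    print(word_statistics("1223"))
    print(word_statistics("1122"))
```

For the word 1223 the closed circuit has edges {3,1}, {1,2}, {2,2}, {2,3}, giving one loop and three singly traversed edges, so this word has $u > q$ and is excluded from further consideration.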

It is not hard to check that excluding the $q$ loop-edges (each connecting some
vertex to itself), there are at most $k + \lfloor (u - q)/2 \rfloor$ distinct edges in $w$. These
distinct edges form a connected path through $v(w)$ vertices, which for $u \ge 1$ must
also be a circuit. Consequently, for any of the words $w$ we are to consider,

\[ v(w) \le k + \mathbf{1}_{\{u(w) = 0\}} + \lfloor (u(w) - q(w))/2 \rfloor \le k. \tag{4.23} \]

Proceeding to bound $|f_n(w)|$, note that any contribution which grows with $n$
must come from the $q$ diagonal entries of $M_n$ and $\widetilde{M}_n$ which are encountered
according to the circuit $w$. Suppose first that $u \ge 1$, in which case $f_n(w) = \mathbb{E} M_w$.
Computing the latter, upon expanding the sums in the $q$ relevant diagonal entries of
$D_n = \operatorname{diag}(\sum_{j=1}^{n} X_{ij})$, we must assign specific choices to at least $u$ of the resulting
"free" indices $j_1,\ldots,j_q \in \{1,\ldots,n\}$ in order to match all $u$ unmatched edges of
$w$ of the form $\{w[i-1], w[i]\} \in \Gamma_n$. Indeed, by independence and centering, every
other term of this expansion has zero expectation. After doing so, as each diagonal
entry of $D_n$ is normal of mean zero and variance $n$, we conclude by Hölder's
inequality that $|f_n(w)| \le C_w n^{(q-u)/2}$. By our bound (4.23) on $v(w)$, this implies
that (4.22) holds.

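Consider next words $w$ for which $u = 0$ and let $a_1,\ldots,a_q$ be the $q$ vertices
for which $\{a_i, a_i\}$ is an edge of the circuit $w$. Let $M_{ii} = Q_i - S_i$ and $\widetilde{M}_{ii} = \widetilde{Q}_i - \widetilde{S}_i$,
for $i = 1,\ldots,2k$, where $Q_i = X_{ii} - \sum_{j=1}^{2k} X_{ij}$,
$\widetilde{Q}_i = X_{ii} - \widetilde{X}_{i,n+1} - \sum_{j=1}^{2k} \widetilde{X}_{ij}$ and
$S_i = \sum_{j=2k+1}^{n} X_{ij}$, with the corresponding expressions for $\widetilde{S}_i$ (here $\widetilde{X}_{ij}$ denote the
entries of the independent copy used to build $\widetilde{D}_n$). Note that we may
and shall replace each $\widetilde{S}_i$ by $S_i$ without altering $\mathbb{E}\widetilde{M}_w$, and since the off-diagonal
entries of $M_n$ and $\widetilde{M}_n$ are the same, we have that

\begin{align*}
f_n(w) &= \mathbb{E}\Bigl[ L_w \Bigl\{ \prod_{i=1}^{q} (Q_{a_i} - S_{a_i}) - \prod_{i=1}^{q} (\widetilde{Q}_{a_i} - S_{a_i}) \Bigr\} \Bigr] \\
 &= \sum_{i=1}^{q} \mathbb{E}\Bigl[ L_w (Q_{a_i} - \widetilde{Q}_{a_i}) \prod_{j=1}^{i-1} \widetilde{M}_{a_j,a_j} \prod_{j=i+1}^{q} M_{a_j,a_j} \Bigr],
\end{align*}

where $L_w$ is the product of the $(2k - q)$ off-diagonal entries of $M_n$ that correspond
to the edges of $w$ that are in $\Gamma_n$. Since the distribution of $(L_w, \{Q_i\}, \{\widetilde{Q}_i\})$ is
independent of $n > 2k$, while $M_{ii}$ and $\widetilde{M}_{ii}$ are normal of mean zero and variance
at most $n + 2$, it follows by Hölder's inequality that $|f_n(w)| \le C_w n^{(q-1)/2}$,
which by (4.23) results with (4.22).

As already seen, (4.22) implies that (4.20) holds and hence the proof of the
lemma is complete. □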


Let $\gamma_0(dx) = \frac{1}{2\pi} \sqrt{4 - x^2}\, \mathbf{1}_{|x| \le 2}\, dx$ denote the semicircle distribution,
$\gamma_1(dx) = \frac{1}{\sqrt{2\pi}} \exp(-x^2/2)\, dx$ denote the standard normal distribution and let
$\gamma_M = \gamma_0 \boxplus \gamma_1$ be the corresponding free convolution. In view of Lemma 4.12, our
next result shows that the even moments of $\mathbb{E}\hat\mu(M_n/\sqrt{n})$ converge as $n \to \infty$ to
those of $\gamma_M$.


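PROPOSITION 4.13. For every $k \in \mathbb{N}$,

\[ \lim_{n\to\infty} n^{-(k+1)} \mathbb{E}\operatorname{tr}(W_n^{2k}) = \int x^{2k} \, d\gamma_M. \tag{4.24} \]

PROOF. Let $A_n = Z_n + n^{-1/2} \xi I_n$, so $n^{-1/2} W_n = A_n + n^{-1/2} X_n$. By the
strong law of large numbers, with probability 1, $\hat\mu(A_n) \to \gamma_1$ weakly. Further,
$\sup_n \mathbb{E} \int |x| \, d\hat\mu(A_n) < \infty$, and $\mathbb{E} \int |x| \, d\hat\mu(n^{-1/2} X_n) \le n^{-1} \sqrt{\mathbb{E}\operatorname{tr}(X_n^2)} = 1$,
implying by Pastur and Vasilchuk ([17], Theorem 2.1 and page 280), that $\hat\mu(W_n/\sqrt{n})$
converges weakly to $\gamma_M$, in probability. It follows that for any $k \in \mathbb{N}$ and all $r < \infty$,

\[ \lim_{n\to\infty} \mathbb{E} \int h_r(x) \, d\hat\mu(W_n/\sqrt{n}) = \int h_r(x) \, d\gamma_M, \tag{4.25} \]

where $h_r(x) = (\min(|x|, r))^{2k}$. Recall that all moments of $\gamma_M$ are finite (cf.
Proposition A.3), so as $r \to \infty$ the right-hand side of (4.25) converges to $\int x^{2k} \, d\gamma_M$. It
is not hard to check that for any $k \in \mathbb{N}$,

\[ \mathbb{E} \int x^{4k} \, d\hat\mu(W_n/\sqrt{n}) = n^{-(2k+1)} \mathbb{E}\operatorname{tr}(W_n^{4k}) \]

is bounded in $n$ by some $C_k < \infty$. Hence, for all $n$,

\[ \Bigl| n^{-(k+1)} \mathbb{E}\operatorname{tr}(W_n^{2k}) - \mathbb{E} \int h_r(x) \, d\hat\mu(W_n/\sqrt{n}) \Bigr| \le C_k r^{-2k}, \]

and (4.24) follows by considering $r \to \infty$ in (4.25). □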


We next derive the analog of Proposition 4.3 and, similarly to Proposition 4.9,
get as a result the concentration of moments of $\hat\mu(M_n/\sqrt{n})$ around those of
$\mathbb{E}(\hat\mu(M_n/\sqrt{n}))$.


PROPOSITION 4.14. Fix $r \in \mathbb{N}$. Let $N$ denote the number of vertex-matched
quadruples of circuits in $\Gamma_n$ with $r$ vertices each, such that none of them is
self-matched. Then there is a constant $C_r$ such that

\[ N \le C_r n^{2r+2}. \]

PROOF. Let $\mathcal{P}$ denote the partition of the $4r$ vertices of the circuits $\pi_1,\ldots,\pi_4$
in $\Gamma_n$ to $|\mathcal{P}| \le 2r$ distinct groups of matching vertices, with at least two elements
in each group, while having each circuit cross-matched to at least one of the other
circuits. As part of $\mathcal{P}$ we specify also which of the four types of edges to use in
each connection along the circuits. For $i = 1,2,3,4$, let $u_i = u_i(\mathcal{P})$ be the number
of distinct vertices in $\pi_i$ that do not appear in any $\pi_j$, $j < i$. There are at most
$n^{1+u_1}$ ways to choose the circuit $\pi_1$ in agreement with $\mathcal{P}$, that is, $n^2$ ways to
choose the vertex $a_1$ of $\pi_1$ and at most $n$ ways for each of the remaining $u_1 - 1$
distinct vertices of $\pi_1$. For $i = 2,3,4$, per given $\pi_j$, $j < i$, the same procedure
shows that there are at most $n^{1+u_i}$ ways to complete the circuit $\pi_i$. Further, if $\pi_i$ is
cross-matched to $\pi_j$ for some $j < i$, then starting the completion of $\pi_i$ at a vertex
that we already determined by such a cross-match, we have that there are only $n^{u_i}$
ways to complete $\pi_i$. The latter improved bound always applies for $i = 4$, and it
is not hard to check that upon re-ordering the four circuits, we can assure that it
applies also for $i = 3$. We thus get at most $n^{u+2}$ quadruples of circuits per choice
of $\mathcal{P}$, where $u = \sum_i u_i = |\mathcal{P}| \le 2r$, yielding the stated bound. □


PROPOSITION 4.15. Suppose $\{X_{ij};\ j > i \ge 1\}$ is a collection of bounded i.i.d.
random variables such that $\mathbb{E}(X_{12}) = 0$ and $\mathbb{E}(X_{12}^2) = 1$. For any $r \in \mathbb{N}$, there
exists $C_r < \infty$ such that $\mathbb{E}[(\operatorname{tr}(M_n^r) - \mathbb{E}\operatorname{tr}(M_n^r))^4] \le C_r n^{2r+2}$ for all $n \in \mathbb{N}$.

PROOF. By (4.16) we have the Markov analog of (4.13)

\[ \mathbb{E}\bigl[(\operatorname{tr}(M_n^r) - \mathbb{E}\operatorname{tr}(M_n^r))^4\bigr]
 = \sum_{\pi_1,\pi_2,\pi_3,\pi_4} \mathbb{E}\Bigl[\prod_{j=1}^{4} \bigl(X_{\pi_j} - \mathbb{E}(X_{\pi_j})\bigr)\Bigr], \tag{4.26} \]

where the sum is taken over all circuits $\pi_j$, $j = 1,\ldots,4$, in $\Gamma_n$, each having $r$
vertices. With the random variables $\{X_{ij};\ n \ge j > i \ge 1\}$ independent and of mean
zero, just like the proof of Proposition 4.9, it suffices to take the sum in (4.26)
over all vertex-matched quadruples of circuits on $\Gamma_n$, such that none of them is
self-matched. Since $|X_{ij}|$ (and hence $|X_\pi|$) is bounded, the stated inequality follows
from the bound of Proposition 4.14 on the number of such quadruples. □


PROOF OF THEOREM 1.3. The proof is very similar to that of Theorems 1.1
and 1.2, where by Proposition 4.1(ii), we may and shall assume that $\{X_{ij};\ j > i \ge 1\}$
is a collection of i.i.d. bounded random variables. Then, by (4.18) the odd
moments of the average measure $\mathbb{E}(\hat\mu(M_n/\sqrt{n}))$ converge to 0, and by
Proposition 4.13 the even moments converge to those of $\gamma_M$, whereas from
Proposition 4.15 we conclude that with probability 1 the same applies to the moments of
$\hat\mu(M_n/\sqrt{n})$. By Proposition A.3, $\gamma_M$ is a symmetric measure of bounded smooth
density that, though of unbounded support, is uniquely determined by its
moments (having in particular zero odd moments). Hence, the almost sure
convergence $\int x^k \, d\hat\mu(M_n/\sqrt{n}) \to \int x^k \, d\gamma_M$ as $n \to \infty$, for all $k \in \mathbb{N}$, implies the weak
convergence of $\hat\mu(M_n/\sqrt{n})$ to $\gamma_M$. □


APPENDIX 


A.1. Properties of $\gamma_H$, $\gamma_M$ and $\gamma_T$. In this section we establish properties of
the symmetric measures with moments given by (3.5) and (3.8), and of the free
convolution $\gamma_M$ of Theorem 1.3. For proofs, it is convenient to express the volumes
$p_H(w)$ and $p_T(w)$ as the probabilities that involve sums of independent uniform
random variables. This can be done by setting the undetermined variables as the
independent uniform $U[0,1]$ random variables $U_0, U_1, \ldots, U_k$, expressing the
dependent variables as the linear combinations of $U_0, U_1, \ldots, U_k$, and expressing the
volumes as the probabilities that these linear combinations are in the interval $[0,1]$. For
each partition word $w$ of length $2k$ with a nonzero volume $p(w)$, this probability
takes the form

\[ p(w) = \mathbb{P}\Bigl( \bigcap_{i=1}^{k} \Bigl\{ \sum_{j=0}^{M} n_{i,j} U_j \in [0,1] \Bigr\} \Bigr), \tag{A.1} \]

where $n_{i,j}$ are integers and $M = k$.


PROPOSITION A.1. A symmetric measure $\gamma_T$ with even moments given
by (3.5) has unbounded support.


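PROOF. It suffices to show that $(m_{2k})^{1/k} \to \infty$. Let $w$ be a partition word of
length $2k$. Denoting $S_i = \sum_j n_{i,j} U_j - \frac12$, $i = 1,2,\ldots,k$, we have

\[ p_T(w) = \mathbb{P}\Bigl( \bigcap_{i=1}^{k} \bigl\{ |S_i| \le \tfrac12 \bigr\} \Bigr). \tag{A.2} \]

Since the coefficients $n_{i,j}$ in (A.1) take values $0, \pm 1$ only, and $\sum_j n_{i,j} = 1$, each
of the sums $S_i$ in (A.2) has the following form:

\[ S = (U_\alpha - 1/2) + \sum_{j=1}^{L} \bigl(U_{\beta(j)} - U_{\gamma(j)}\bigr), \tag{A.3} \]

where $\alpha$, $\beta(j)$, $\gamma(j)$, $j = 1,\ldots,L$, are all different. Let $L_i$ denote the number of
independent random variables $U$ in this representation for $S_i$. Clearly, $1 \le L_i \le k + 1$.

Fixing $\varepsilon > 0$ let $U_j = 1/2 + V_j/(\varepsilon(k+1))$ for $j = 0,\ldots,k$. For $k > 1/\varepsilon$ define
the event

\[ A = \bigcap_{j=0}^{k} \Bigl\{ |U_j - 1/2| \le \frac{1}{2\varepsilon(k+1)} \Bigr\}, \]

noting that conditionally on $A$, the random variables $V_0,\ldots,V_k$ are independent,
each uniformly distributed on $[-1/2, 1/2]$. As under this conditioning the
i.i.d. random variables $\{V_j\}$ have symmetric laws, it is easy to check that for
$i = 1,\ldots,k$, the form (A.3) of $S_i$ implies that

\[ \mathbb{P}\bigl(|S_i| > \tfrac12 \,\big|\, A\bigr)
 = \mathbb{P}\Bigl( \Bigl| \sum_{j=1}^{L_i} V_j \Bigr| > \varepsilon(k+1)/2 \Bigr)
 = 2\,\mathbb{P}\Bigl( \sum_{j=1}^{L_i} V_j > \varepsilon(k+1)/2 \Bigr), \]

which by Markov's inequality is bounded above by

\[ 2 e^{-\varepsilon^2 (k+1)/2} \bigl(\mathbb{E} e^{\varepsilon V}\bigr)^{L_i}
 = 2 e^{-\varepsilon^2 (k+1)/2} \Bigl( \frac{e^{\varepsilon/2} - e^{-\varepsilon/2}}{\varepsilon} \Bigr)^{L_i}. \]

Since $\frac{e^{x} - e^{-x}}{2x} \le e^{x^2}$ for $x > 0$, and $L_i \le k + 1$, we deduce that

\[ \mathbb{P}\bigl(|S_i| > \tfrac12 \,\big|\, A\bigr) \le 2 \exp\bigl(-\varepsilon^2 (k+1)/2 + \varepsilon^2 L_i/4\bigr) \le 2 e^{-\varepsilon^2 (k+1)/4}, \tag{A.4} \]

for $i = 1,\ldots,k$. As $2k e^{-\varepsilon^2 (k+1)/4} \le 1/2$ for some $k_0 = k_0(\varepsilon) < \infty$ and all $k \ge k_0$,
it follows from (A.2) and (A.4) that for all $k \ge k_0$ and any word $w$ of length $2k$,

\[ p_T(w) \ge \tfrac12 \mathbb{P}(A) = \tfrac12 \bigl(\varepsilon(k+1)\bigr)^{-(k+1)}. \tag{A.5} \]

Since there are more than $k!$ partition words $w$ of length $2k$, this shows that for all
large enough $k$ we have

\[ m_{2k} \ge \tfrac12 k! \bigl(\varepsilon(k+1)\bigr)^{-(k+1)} \ge (3\varepsilon)^{-k}. \]

Hence, $\limsup_{k\to\infty} (m_{2k})^{1/k} \ge 1/(3\varepsilon)$. As $\varepsilon > 0$ is arbitrarily small, this completes
the proof. □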


PROPOSITION A.2. A symmetric measure $\gamma_H$ with even moments given
by (3.8) is not unimodal and has unbounded support.

PROOF. Suppose that the symmetric distribution $\gamma_H$ is unimodal. Since all
moments of $\gamma_H$ are finite, from Khinchin's theorem (see [14], Theorem 4.5.1), it
follows that if $\phi(t) = \int e^{itx} \gamma_H(dx)$ denotes the characteristic function of $\gamma_H$, then
$g(t) = \phi(t) + t\phi'(t)$ must be a characteristic function, too. The even moments
corresponding to $g(t)$ are $(2k+1) m_{2k}(\gamma_H)$, and must be a positive definite sequence,
that is, the Hankel matrices with entries $[(2(i+j) - 3)\, m_{2(i+j-2)}(\gamma_H)]_{1 \le i,j \le n}$
should all be nonnegative definite. However, with $m_4 = 2$, $m_6 = 11/2$ and $m_8 = 281/15$,
for $n = 3$ the determinant

\[ \det \begin{pmatrix} 1 & 3m_2 & 5m_4 \\ 3m_2 & 5m_4 & 7m_6 \\ 5m_4 & 7m_6 & 9m_8 \end{pmatrix}
 = \det \begin{pmatrix} 1 & 3 & 10 \\ 3 & 10 & 77/2 \\ 10 & 77/2 & 843/5 \end{pmatrix} = -73/20 \]

is negative. Thus, $\gamma_H$ is not unimodal.
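The value of the $3 \times 3$ determinant can be reproduced in exact rational arithmetic. The sketch below takes the moments $m_0 = m_2 = 1$, $m_4 = 2$, $m_6 = 11/2$, $m_8 = 281/15$ from the display above as given inputs (they are not recomputed here); the helper names are illustrative.

```python
from fractions import Fraction


def khinchin_hankel(even_moments, size):
    """Matrix [(2(i+j)-3) * m_{2(i+j-2)}], 1 <= i, j <= size,
    where even_moments[k] stands for m_{2k}."""
    return [[(2 * (i + j) - 3) * even_moments[i + j - 2]
             for j in range(1, size + 1)] for i in range(1, size + 1)]


def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


if __name__ == "__main__":
    moments = [Fraction(1), Fraction(1), Fraction(2),
               Fraction(11, 2), Fraction(281, 15)]
    matrix = khinchin_hankel(moments, 3)
    print(matrix)
    print(det3(matrix))
```

A negative determinant rules out nonnegative definiteness of the Hankel matrix, which is all the argument needs.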

To show that the support of $\gamma_H$ is unbounded we proceed like in the Toeplitz
case. The main technical obstacle is that some partition words contribute zero
volume. We will therefore have to find enough partition words that contribute a
nonzero volume, and then give a lower bound for this contribution.

We consider only moments of order $4k - 2$, $k \ge 2$, and find the contribution of
the partition words which have no repeated letters in the first half, that is,

\[ w[1] \ne w[2] \ne \cdots \ne w[2k-1]. \]

That is, we consider the set of partition words $w$ of length $4k - 2$ of the form
$w = abc\ldots$ with the first $2k - 1$ letters written in the fixed (alphabetic) order,
followed by the repeated letters $a, b, c, \ldots$ at positions $2k, \ldots, 4k - 2$. We also
require that the repeats are placed at odd distance from the original matching letter.
Formally, we consider the set of partition words $w$ of length $4k - 2$ which satisfy
the following condition:

If $w[\alpha] = w[\beta]$ and $\alpha < \beta$, then $\alpha \not\equiv \beta \bmod 2$, $\alpha \le 2k - 1$ and $\beta \ge 2k$.

Since we can permute all letters at locations $2k, 2k+2, \ldots, 4k-2$ and all letters
at locations $2k+1, 2k+3, \ldots, 4k-3$, clearly there are $k!(k-1)!$ such partition
words.

To show that all such partition words contribute a nonzero volume, we need to
carefully analyze the matrix of the resulting system of equations (3.6). This is a
$(2k-1) \times (4k-1)$ matrix with entries $0, \pm 1$ only. The first $2k - 1$ columns of
the matrix are filled in with the pattern of sliding pairs $1, 1$ corresponding to first
occurrences of every letter, that is, the left-hand sides of equations (3.6) are simply

\begin{align*}
x_0 + x_1 &= \cdots \\
x_1 + x_2 &= \cdots \\
 &\ \,\vdots \\
x_{2k-2} + x_{2k-1} &= \cdots.
\end{align*}

So the first $2k$ columns of the matrix are as follows, with the star denoting as yet
unspecified entries of the $2k$th column.

\[ \begin{pmatrix}
1 & 1 & 0 & 0 & \cdots & 0 & 0 & * \\
0 & 1 & 1 & 0 & \cdots & 0 & 0 & * \\
0 & 0 & 1 & 1 & \cdots & 0 & 0 & * \\
0 & 0 & 0 & 0 & \cdots & 1 & 1 & * \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 1
\end{pmatrix} \]

The remaining columns are as follows. In every even row of the second half we
have disjoint (nonoverlapping) pairs $(-1, -1)$, including the site adjacent to the
"last letter," that has entry 1 in the last row, and entry $-1$ in one of the odd rows.
None of these $-1, -1$ are in the last column, a coefficient of $x_{4k-2}$.

In the odd rows we have pairs of consecutive $(-1, -1)$ which overlap entries
from the even rows, but not themselves, including a single $(-1, -1)$ pair which
fills in one spot in the last column, the coefficients of $x_{4k-2}$.

For example, the word $w = abc\ldots abc\ldots$, where all $2k - 1$ letters $a, b, c, \ldots$
are repeated alphabetically twice, is in the class of the partition words under
consideration. The corresponding system of equations is

\begin{align*}
x_0 + x_1 &= x_{2k-1} + x_{2k}, \\
x_i + x_{i+1} &= x_{2k+i-1} + x_{2k+i}, \qquad i = 1, 2, \ldots, 2k-3, \\
x_{2k-2} + x_{2k-1} &= x_{4k-3} + x_{4k-2};
\end{align*}

and its matrix is

\[ \begin{pmatrix}
1 & 1 & 0 & 0 & \cdots & 0 & 0 & -1 & -1 & 0 & \cdots & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & \cdots & 0 & 0 & 0 & -1 & -1 & \cdots & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & \cdots & 0 & 0 & 0 & 0 & -1 & \cdots & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \cdots & 1 & 1 & 0 & 0 & 0 & \cdots & -1 & -1 & 0 \\
0 & 0 & 0 & 0 & \cdots & 0 & 1 & 1 & 0 & 0 & \cdots & 0 & -1 & -1
\end{pmatrix} \]


All other partition words in our class are obtained from permuting letters
$w[2k], w[2k+2], \ldots, w[4k-2]$, and then permuting letters $w[2k+1], w[2k+3], \ldots, w[4k-3]$
of $w = abc\ldots abc\ldots$. Thus all other systems of equations are
obtained from the above one by permuting even rows in columns $2k+1, 2k+2, \ldots, 4k-2$
and odd rows in columns $2k, 2k+1, \ldots, 4k-1$ (apart from the 1 at
column $2k$ and row $2k-1$ which is never permuted, but gets eliminated if the first
row permutes to become the last one). For each of these words the sum of all odd
rows in the system minus the sum of all even rows is $[1, 0, \ldots, 0, -1]$, implying
that for such $w$ the additional constraint $x_0 = x_{4k-2}$ we require when computing
$p_H(w)$ is merely a consequence of (3.6).
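Both the count $k!(k-1)!$ and the alternating row-sum identity can be checked by enumeration for small $k$. The sketch below assumes, following the example system above, that a letter first seen at position $\alpha$ and repeated at position $\beta$ contributes the equation $x_{\alpha-1} + x_\alpha - x_{\beta-1} - x_\beta = 0$ as row $\alpha$; the function names are illustrative.

```python
from itertools import permutations
from math import factorial


def admissible_repeat_maps(k):
    """Yield dicts alpha -> beta describing the words of length 4k-2 in the
    class: first occurrences at alpha = 1..2k-1, repeats at beta = 2k..4k-2,
    with alpha and beta of opposite parity."""
    firsts = range(1, 2 * k)
    for betas in permutations(range(2 * k, 4 * k - 1)):
        if all((a + b) % 2 == 1 for a, b in zip(firsts, betas)):
            yield dict(zip(firsts, betas))


def alternating_row_sum(repeat_map, k):
    """Sum of the odd rows minus the sum of the even rows, returned as the
    coefficient vector of (x_0, ..., x_{4k-2})."""
    coeffs = [0] * (4 * k - 1)
    for alpha, beta in repeat_map.items():
        sign = 1 if alpha % 2 == 1 else -1
        coeffs[alpha - 1] += sign
        coeffs[alpha] += sign
        coeffs[beta - 1] -= sign
        coeffs[beta] -= sign
    return coeffs


if __name__ == "__main__":
    for k in (2, 3, 4):
        maps = list(admissible_repeat_maps(k))
        target = [1] + [0] * (4 * k - 3) + [-1]
        print(k, len(maps), factorial(k) * factorial(k - 1),
              all(alternating_row_sum(m, k) == target for m in maps))
```

The enumeration runs over all $(2k-1)!$ arrangements of the repeats, so it is only practical for small $k$; it is meant as a sanity check, not as part of the proof.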

The solutions of equations (3.6) for such partition words $w$ are easy to analyze
due to parity considerations. Gaussian elimination consists here of subtractions of
the given row from the row directly above it, starting with the subtraction of the
$(2k-1)$ row and ending with the subtraction of the second row from the first row,
at which point the first $2k - 1$ columns become the identity matrix. During these
subtractions, a $-1$ entry in each column of the original system can meet a nonzero
entry only from a row positioned at an odd distance above it, in which case they
cancel each other. So as we keep subtracting, all coefficients take values $0, \pm 1$
only. Further, for each row the sum of the entries in columns $2k, \ldots, 4k-1$ is $-2$,
except for the last row for which it is $-1$. Thus, after all subtractions have been
made, these sums are $-1$ at each of the rows. We can now set the $2k$ undetermined
variables to i.i.d. $U[0,1]$ random variables, $x_{2k-1} = U_0, \ldots, x_{4k-2} = U_{2k-1}$, and
solve the $2k - 1$ equations for the dependent variables $x_0, \ldots, x_{2k-2}$. By the above
considerations we know that each of these dependent random variables is
expressed as an alternating sum of independent uniform $U[0,1]$ random variables
of the form (A.3).

The argument we used for deriving (A.5) thus gives the bound
$p_H(w) \ge \frac12 (2k\varepsilon)^{-2k}$ for each of these $k!(k-1)!$ partition words, and hence for all $k$ large
enough, we have

\[ m_{4k-2}(\gamma_H) \ge \tfrac12 k!(k-1)!\, (2\varepsilon k)^{-2k} \ge (6\varepsilon)^{-2k}. \]

Thus $(m_{4k-2})^{1/k} \to \infty$, which implies that the support of $\gamma_H$ is unbounded. □

PROPOSITION A.3. The free convolution $\gamma_M = \gamma_0 \boxplus \gamma_1$ of the standard
semicircle distribution $\gamma_0$ and the standard normal $\gamma_1$ is a symmetric measure,
determined by moments, has unbounded support and a smooth bounded density.

PROOF. By Corollary 2 in [2], $\gamma_M$ has a density, by Corollary 4 in [2] the
density is smooth and by Proposition 5 in [2] it is bounded.

We now verify that $\gamma_M$ is determined by moments and has unbounded support.
We need the following observation: a probability measure $\mu$ has odd moments
vanishing iff the odd free cumulants $k_{2r+1}(\mu)$ of $\mu$ vanish. This can be easily read
from formula (72) in [22].

Since free cumulants linearize the free convolution, $k_r(\gamma_M) = k_r(\gamma_0) + k_r(\gamma_1)$.
This shows that the odd moments of $\gamma_M$ vanish. Recall that the free cumulants
$k_r(\mu)$ and the moments $m_r(\mu)$ of a probability measure $\mu$ are related by
formula (72) in [22]. In particular, for $\mu$ with vanishing odd moments, the even
cumulants $k_{2r}(\mu)$ are related to the moments by the equations

\[ m_{2n}(\mu) = \sum_{r=1}^{n} k_{2r}(\mu) \sum_{i_1 + \cdots + i_{2r} = 2n - 2r} \ \prod_{j=1}^{2r} m_{i_j}(\mu), \qquad n = 1, 2, \ldots. \tag{A.6} \]

By symmetry, the odd cumulants of $\gamma_1$ vanish, and $k_{2r}(\gamma_1)$ are nonnegative;
$k_{2r}(\gamma_1)$ count all irreducible pair partitions of $\{1,\ldots,2r\}$ (see [5], page 152). Since
$k_2(\gamma_0) = 1$, and all higher free cumulants of $\gamma_0$ vanish (see [12], Example 2.4.6),
we have

\[ k_{2r}(\gamma_1) \le k_{2r}(\gamma_M) \le 2 k_{2r}(\gamma_1). \]

Together with (A.6) this implies by induction that

\[ m_{2r}(\gamma_1) \le m_{2r}(\gamma_M) \le 4^r m_{2r}(\gamma_1). \]

In particular, $\gamma_M$ has unbounded support and is uniquely determined by moments.
Since its odd cumulants vanish, the odd moments vanish and $\gamma_M$ is symmetric. □
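The moment-cumulant relation (A.6) is easy to run in both directions for the first few orders. The sketch below assumes (A.6) in the form of the usual free moment-cumulant recursion restricted to symmetric measures, with the inner sum over nonnegative indices $i_1 + \cdots + i_{2r} = 2n - 2r$; it recovers the even free cumulants of the standard normal from its moments $(2n-1)!!$, adds $k_2(\gamma_0) = 1$, and rebuilds the moments of $\gamma_M$. All function names are illustrative.

```python
def compositions_sum(m, parts, total):
    """Sum over i_1 + ... + i_parts = total (i_j >= 0) of prod m[i_j],
    i.e. the coefficient of z^total in (sum_i m[i] z^i)^parts."""
    poly = [1] + [0] * total
    for _ in range(parts):
        poly = [sum(poly[s] * m[t - s] for s in range(t + 1))
                for t in range(total + 1)]
    return poly[total]


def moments_from_even_cumulants(k_even, n_max):
    """k_even[r] = k_{2r} (index 0 unused); returns m[0..2*n_max] via (A.6)."""
    m = [0] * (2 * n_max + 1)
    m[0] = 1
    for n in range(1, n_max + 1):
        m[2 * n] = sum(k_even[r] * compositions_sum(m, 2 * r, 2 * n - 2 * r)
                       for r in range(1, n + 1))
    return m


def even_cumulants_from_moments(m, n_max):
    """Invert (A.6): the r = n term of the equation for m_{2n} is k_{2n}."""
    k = [0] * (n_max + 1)
    for n in range(1, n_max + 1):
        lower = sum(k[r] * compositions_sum(m, 2 * r, 2 * n - 2 * r)
                    for r in range(1, n))
        k[n] = m[2 * n] - lower
    return k


def markov_limit_moments(n_max):
    """Moments of gamma_M from k_{2r}(gamma_M) = k_{2r}(gamma_0) + k_{2r}(gamma_1)."""
    normal = [0] * (2 * n_max + 1)
    normal[0] = 1
    for n in range(1, n_max + 1):
        normal[2 * n] = normal[2 * n - 2] * (2 * n - 1)   # (2n-1)!!
    k_normal = even_cumulants_from_moments(normal, n_max)
    k_markov = [0] + [k_normal[r] + (1 if r == 1 else 0)
                      for r in range(1, n_max + 1)]
    return normal, k_normal, moments_from_even_cumulants(k_markov, n_max)


if __name__ == "__main__":
    normal, k_normal, markov = markov_limit_moments(4)
    print(k_normal[1:])
    print([markov[2 * n] for n in range(1, 5)])
```

The recovered normal cumulants can be compared with counts of irreducible pair partitions, and the resulting moments with the two-sided bound stated above.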


A.2. Moments of free convolution. In this section we identify moments of
the free convolution $\gamma_0 \boxplus \gamma_1$. The result and the method of proof were suggested
by Bożejko and Speicher [5], who give a combinatorial expression for the moments
of free convolutions of normal densities.

Denote by $\mathcal{W}$ the set of all partition words. Recall that a (partition) subword of
a word $w$ is a partition word $w_1$ such that $w = a \ldots c\, w_1\, d \ldots z$. Let $\mathcal{W}_0$ be the set
of all irreducible partition words, that is, words that have no proper (nonempty)
partition subwords.
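Irreducibility is simple to test by brute force, which gives a feel for how few partition words are irreducible. The sketch below reads a subword as a contiguous block in which every letter that occurs appears exactly twice, and restricts to words in which each letter is used twice; both readings are assumptions made for this illustration, and the function names are hypothetical.

```python
from collections import Counter


def pair_partition_words(n):
    """Yield all words of length 2n in which each of n letters appears
    exactly twice, letters numbered 0, 1, ... by order of first appearance."""
    def fill(word, next_letter):
        if None not in word:
            yield tuple(word)
            return
        first = word.index(None)
        word[first] = next_letter
        for j in range(first + 1, len(word)):
            if word[j] is None:
                word[j] = next_letter
                yield from fill(word, next_letter + 1)
                word[j] = None
        word[first] = None

    yield from fill([None] * (2 * n), 0)


def is_partition_block(block):
    return all(count == 2 for count in Counter(block).values())


def is_irreducible(word):
    """True if no proper nonempty contiguous block is itself a partition word."""
    size = len(word)
    for i in range(size):
        for j in range(i + 2, size + 1, 2):
            if j - i < size and is_partition_block(word[i:j]):
                return False
    return True


if __name__ == "__main__":
    for n in range(1, 5):
        words = list(pair_partition_words(n))
        print(n, len(words), sum(is_irreducible(w) for w in words))
```

For length 4, for instance, only abab is irreducible: aabb contains aa and abba contains bb.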


DEFINITION A.1 ([5]). We say that $p : \mathcal{W} \to \mathbb{R}$ is pyramidally multiplicative,
if for every $w \in \mathcal{W}$ of the form $w = a \ldots c\, w_1\, d \ldots z$, we have
$p(w) = p(w_1)\, p(a \ldots c\, d \ldots z)$.

LEMMA A.4 ([5], page 152). Suppose that the moments are given by

\[ m_{2n} = \sum_{w \in \mathcal{W},\ |w| = 2n} p(w), \tag{A.7} \]

and $m_{2n-1} = 0$, $n = 1, 2, \ldots$. If the weights $p(w)$ are pyramidally multiplicative,
then the free cumulants are

\[ k_{2n} = \sum_{w \in \mathcal{W}_0,\ |w| = 2n} p(w). \]


PROPOSITION A.5. A symmetric measure $\gamma_M$ with the even moments given
by (3.1) is given by the free convolution $\gamma_M = \gamma_0 \boxplus \gamma_1$.

PROOF. We apply Lemma A.4 to measures $\gamma_M$, $\gamma_0$ and $\gamma_1$. If $w = \ldots w_1 \ldots$, then
$h(w) = h(w_1) + h(w \setminus w_1)$, so the Markov weights $p_M(w) := 2^{h(w)}$ are pyramidally
multiplicative. It is well known that the moments of the normal distribution
are given by (A.7) with $p_1(w) = 1$, which is (trivially) multiplicative. The
moments of the semicircle distribution are given by (A.7) with $p_0(w) = 1$ for the
so-called noncrossing words, and $p_0(w) = 0$ otherwise. (A partition word is
noncrossing, if it can be reduced to the empty word by removing pairs of consecutive
double letters $xx$, one at a time.) It is well known that this weight is pyramidally
multiplicative, too.
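The reduction rule that defines noncrossing words translates directly into a stack scan: push a letter unless it equals the letter on top of the stack, in which case cancel the pair. The sketch below applies this to all words of length $2n$ with each letter used twice; the enumeration helper and its restriction to at most eight letters are choices made only for this illustration.

```python
from itertools import permutations


def is_noncrossing(word):
    """True if repeatedly deleting adjacent double letters empties the word."""
    stack = []
    for letter in word:
        if stack and stack[-1] == letter:
            stack.pop()            # cancel a pair of consecutive equal letters
        else:
            stack.append(letter)
    return not stack


def partition_words(n):
    """All words of length 2n using each of the first n letters twice,
    with letters appearing for the first time in alphabetical order."""
    letters = "abcdefgh"[:n]
    found = set()
    for perm in permutations(letters * 2):
        if "".join(dict.fromkeys(perm)) == letters:
            found.add("".join(perm))
    return sorted(found)


if __name__ == "__main__":
    print(is_noncrossing("abba"), is_noncrossing("abab"))
    for n in range(1, 5):
        words = partition_words(n)
        print(n, len(words), sum(is_noncrossing(w) for w in words))
```

The noncrossing counts obtained this way are the Catalan numbers, which are the even moments of the semicircle distribution.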

We now use Lemma A.4 to compare the free cumulants of the semicircle,
normal and Markov distributions. Let $w \in \mathcal{W}_0$. If $|w| = 2$, then $p_M(w) = 2$, and
otherwise $p_M(w) = 2^0 = 1$ as an irreducible word has no proper subwords, and hence
no encapsulated subwords. Thus $k_2(\gamma_M) = 2$, and for $n \ge 2$

\[ k_{2n}(\gamma_M) = \#\{w \in \mathcal{W}_0,\ |w| = 2n\}. \]

If $|w| = 2$, then $p_0(aa) = 1$, and otherwise $p_0(w) = 0$ as an irreducible word of
length 4 or more cannot be noncrossing. Thus $k_2(\gamma_0) = 1$, and for $n \ge 2$

\[ k_{2n}(\gamma_0) = 0. \]

From $p_1(w) = 1$ we get

\[ k_{2n}(\gamma_1) = \#\{w \in \mathcal{W}_0,\ |w| = 2n\} \]

for $n \ge 1$; in particular, $k_2(\gamma_1) = 1$. Thus, for $n \ge 1$

\[ k_{2n}(\gamma_M) = k_{2n}(\gamma_0) + k_{2n}(\gamma_1), \]

which proves that $\gamma_M = \gamma_0 \boxplus \gamma_1$. □


Acknowledgments. Part of the research of WB was conducted while visit- 
ing the Department of Statistics of Stanford University. The authors thank Marek 
Bożejko, Persi Diaconis, J. T. King, Rafał Latała, Qiman Shao, Ronald Speicher
and Richard P. Stanley for helpful comments, references and encouragement, Ofer 
Zeitouni for a shorter proof of Theorem 1.3 and additional comments, A. Sakha- 
nenko for electronic access to his papers, and Steven Miller, Chris Hammond and 
Arup Bose for information about their research. 


REFERENCES 


[1] BAr, Z. D. (1999). Methodologies in spectral analysis of large-dimensional random matrices, 
a review. Statist. Sinica 9 611-677. MR1711663 

[2] BIANE, P. (1997). On the free convolution with a semi-circular distribution. Indiana Univ. 
Math. J. 46 705—718. MR1488333 

[3] BOSE, A., CHATTERJEE, S. and GANGOPADHYAY, S. (2003). Limiting spectral distributions 
of large dimensional random matrices. J. Indian Statist. Assoc. 41 221-259. MR2101995 

[4] BOSE, A. and MITRA, J. (2002). Limiting spectral distribution of a special circulant. Statist. 
Probab. Lett. 60 111—120. MR1945684 

[5] BOoZEJKO, M. and SPEICHER, R. (1996). Interpolations between bosonic and fermionic rela- 
tions given by generalized Brownian motions. Math. Z. 222 135-159. MR1388006 

[6] BRYC, W., DEMBO, A. and JIANG, T. (2003). Spectral measure of large random Hankel, 
Markov and Toeplitz matrices. Expanded version available at http://arxiv.org/abs/math. 
PR/0307330. 


38 W. BRYC, A. DEMBO AND T. JIANG 


[7] DIACONIS, P. (2003). Patterns in eigenvalues: The 70th Josiah Willard Gibbs lecture. Bull. 
Amer. Math. Soc. (N.S.) 40 155—178 (electronic). MR1962294 
[8] DUDLEY, R. M. (2002). Real Analysis and Probability. Cambridge Univ. Press. MR1932358 
[9] FULTON, W. (2000). Eigenvalues, invariant factors, highest weights, and Schubert calculus. 
Bull. Amer. Math. Soc. (N.S.) 37 209—249 (electronic). MR1754641 
[10] GRENANDER, U. and SzEGÓ, G. (1984). Toeplitz Forms and Their Applications, 2nd ed. 
Chelsea, New York. MR890515 
[11] HAMMOND, C. and MILLER, S. (2005). Eigenvalue density distribution for real symmetric 
Toeplitz ensembles. J. Theoret. Probab. 18 537—566. 
[12] HIAI, F. and PETZ, D. (2000). The Semicircle Law, Free Random Variables and Entropy. Amer. 
Math. Soc., Providence, RI. MR1746976 
[13] LIDSKIÏ, V. B. (1950). On the characteristic numbers of the sum and product of symmetric 
matrices. Dokl. Akad. Nauk SSSR (N.S.) 75 769—772. MR39686 
[14] LUKACS, E. (1970). Characteristic Functions, 2nd ed. Hafner, New York. MR346874 
[15] MOHAR, B. (1991). The Laplacian spectrum of graphs. In Graph Theory, Combinatorics, and 
Applications 2 871-898. Wiley, New York. MR1170831 
[16] NICOLAS, J.-L. (1992). An integral representation for Eulerian numbers. In Sets, Graphs and 
Numbers. Colloq. Math. Soc. János Bolyai 60 513-527. North-Holland, Amsterdam.
MR1218216 
[17] PASTUR, L. and VASILCHUK, V. (2000). On the law of addition of random matrices. Comm. 
Math. Phys. 214 249—286. MR1796022 
[18] SAKHANENKO, A. I. (1985). Estimates in an invariance principle. In Limit Theorems of Prob- 
ability Theory. Trudi Inst. Math. 5 27-44, 175. Nauka, Novosibirsk. MR821751 
[19] SAKHANENKO, A. I. (1991). On the accuracy of normal approximation in the invariance prin- 
ciple. Siberian Adv. Math. 1 58-91. MR1138005 
[20] SEN, A. and SRIVASTAVA, M. (1990). Regression Analysis. Springer, New York. MR1063855 
[21] SERRE, J.-P. (1997). Répartition asymptotique des valeurs propres de l'opérateur de Hecke Tp. 
J. Amer. Math. Soc. 10 75-102. MR1396897 
[22] SPEICHER, R. (1997). Free probability theory and non-crossing partitions. Sém. Lothar. Com- 
bin. 39 Art. B39c (electronic). MR1490288 
[23] TANNY, S. (1973). A probabilistic interpretation of Eulerian numbers. Duke Math. J. 40 
717—722. MR340045 
[24] WIGNER, E. P. (1958). On the distribution of the roots of certain symmetric matrices. Ann. of
Math. (2) 67 325-327. MR95527


W. BRYC
DEPARTMENT OF MATHEMATICS
UNIVERSITY OF CINCINNATI
P.O. BOX 210025
CINCINNATI, OHIO 45221
USA
E-MAIL: wlodzimierz.bryc@uc.edu
URL: math.uc.edu/~brycw/

A. DEMBO
DEPARTMENT OF STATISTICS
AND DEPARTMENT OF MATHEMATICS
STANFORD UNIVERSITY
STANFORD, CALIFORNIA 94305
USA
E-MAIL: amir@math.stanford.edu
URL: www-stat.stanford.edu/~amir/

T. JIANG
SCHOOL OF STATISTICS
313 FORD HALL
224 CHURCH STREET S.E.
MINNEAPOLIS, MINNESOTA 55455
USA
E-MAIL: tjiang@stat.umn.edu
URL: www.stat.umn.edu/~tjiang/


The Annals of Probability 

2006, Vol. 34, No. 1, 39-79 

DOI: 10.1214/009117905000000459 

© Institute of Mathematical Statistics, 2006 


ON RANDOM ALMOST PERIODIC TRIGONOMETRIC 
POLYNOMIALS AND APPLICATIONS 
TO ERGODIC THEORY 


BY GUY COHEN¹ AND CHRISTOPHE CUNY²


Ben-Gurion University 


We study random exponential sums of the form $\sum_{k=1}^{n} X_k \exp\{i(\lambda_k^{(1)} t_1 +
\cdots + \lambda_k^{(s)} t_s)\}$, where $\{X_n\}$ is a sequence of random variables and $\{\lambda_n^{(i)} : 1 \le
i \le s\}$ are sequences of real numbers. We obtain uniform estimates (on compact
sets) of such sums, for independent centered $\{X_n\}$ or bounded $\{X_n\}$ satisfying
some mixing conditions. These results generalize recent results of
Weber [Math. Inequal. Appl. 3 (2000) 443-457] and Fan and Schneider [Ann.
Inst. H. Poincaré Probab. Statist. 39 (2003) 193-216] in several directions.
As applications we derive conditions for uniform convergence of these sums 
on compact sets. We also obtain random ergodic theorems for finitely many 
commuting measure-preserving point transformations of a probability space. 
Finally, we show how some of our results allow us to derive the Wiener-Wintner
property (introduced by Assani [Ergodic Theory Dynam. Systems 23 (2003)
1637-1654]) for certain functions on certain dynamical systems.


1. Introduction. In their pioneering work, Paley and Zygmund [29] studied
Fourier series whose terms have random signs, that is, random Fourier series of the
form $\sum_{n=1}^{\infty} \varepsilon_n a_n e^{int}$, where $\{\varepsilon_n\}$ is a Rademacher sequence (i.i.d. random variables
taking the values $\pm 1$ with probability $\tfrac{1}{2}$), and $\{a_n\}$ is a complex sequence. This
research was continued by Salem and Zygmund in [34].

In this paper we obtain uniform estimates of multidimensional random exponential
sums of the form $\sum_{k=1}^{n} X_k e^{i(\lambda_k^{(1)} t_1 + \cdots + \lambda_k^{(s)} t_s)}$, where $\{X_n\}$ is a sequence of
random variables and $\{\lambda_n^{(i)} : 1 \le i \le s\}$ are sequences of real numbers. Estimates
of this kind were obtained (for the one-dimensional case) in [29] and [34], and
were extended recently in several directions in [16] and [37].


Received August 2004; revised February 2005. 

¹Supported in part by Israel Science Foundation Grant 235/01 and by a Skirball Postdoctoral Fellowship
of the Center of Advanced Studies in Mathematics of Ben-Gurion University. Also supported
by FWF Project P16004-N05 from the E. Schrödinger Institute, Vienna.

²Supported in part by a Skirball Postdoctoral Fellowship of the Center of Advanced Studies in
Mathematics of Ben-Gurion University.

AMS 2000 subject classifications. Primary 37A50, 60F15; secondary 47A35, 42A05. 
Key words and phrases. Moment inequalities, maximal inequalities, almost everywhere conver- 
gence, random Fourier series, Banach-valued random variables. 




Such estimates are useful, for instance, to study almost sure (a.s.) uniform con- 
vergence of certain random Fourier or random almost periodic series (see, e.g., 
[12, 16, 20, 34, 37]) and have applications in (random) ergodic theory (e.g., [8]). 

Let us be more precise concerning this latter point.

In the last decades, many authors worked on ergodic theorems with random
modulation (sometimes called "randomly weighted ergodic theorems"). One important
matter may be formulated as follows: Given a sequence $\{X_n\}$ on a probability
space $(\Omega, \mu, \mathcal{F})$, find a measurable set $\Omega^* \subset \Omega$ with $\mu(\Omega^*) = 1$, such that
for any $\omega \in \Omega^*$ the sequence $a_n := X_n(\omega)$ is a universally good weight sequence
for the ergodic theorem for all functions in some specified class. More precisely,
one wants that for any measure-preserving transformation $\tau$ on a probability space
$(Y, \Sigma, \pi)$, and any function $f$ on $Y$ with a certain integrability property (e.g.,
$f \in L_1$), the sequence $\frac{1}{n} \sum_{k=1}^{n} a_k f \circ \tau^k$ converges $\pi$-a.e.

One main tool in the study of such questions (and related ones) is the use of the 
spectral theorem, which transfers the problem to the study of uniform estimates 
of random trigonometric polynomials. It seems that the first use of the spectral 
theorem in this random context appeared in [33], on the basis of the results of [34],
mentioned above. Since then, many authors have investigated this direction (see, e.g., [1, 2,
4, 8, 11, 35, 37]).

Another tool, mainly introduced by Rosenblatt [33] in this context (see also the 
later papers [4] or [7]), is Stein’s interpolation theorem, which needs estimates on 
partial sums of Dirichlet series. 

Actually, it seems that what is really needed in order to use Stein's interpolation 
theorem is to estimate general exponential sums involving Fourier and Dirichlet 
terms. 

This paper may be divided into two parts, estimates and convergence results. 
First we obtain new estimates, uniform on compacta, for random almost periodic 
polynomials. Our main result in this direction is the following (see Section 3): 


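THEOREM 1.1. Let $\{X_n\}$ be a sequence of random variables, defined on a
probability space $(\Omega, \mu)$. Let $\{\lambda_n^{(1)}\}, \ldots, \{\lambda_n^{(s)}\}$ be sequences of real numbers.

(i) If $\{X_n\}$ are complex-valued, symmetric and independent, then there exist
some constants $C, \varepsilon > 0$, independent of $\{X_n\}$, such that (with 0/0 interpreted
as 1)

\[
\Biggl\| \sup_{m \ge 1} \max_{0 \le n < m} \sup_{T \ge 1}
\exp\Biggl\{ \varepsilon \,
\frac{\max_{\mathbf{t} \in [-T,T]^s} \bigl| \sum_{k=n+1}^{m} X_k e^{i(\lambda_k^{(1)} t_1 + \cdots + \lambda_k^{(s)} t_s)} \bigr|^2}
{R_{n,m} \log(T+1) \log\bigl(m \vee \max_{1 \le k \le m} \max_{1 \le i \le s} |\lambda_k^{(i)}|\bigr)}
\Biggr\} \Biggr\|_{L_1(\mu)} \le C,
\tag{1}
\]

where $R_{n,m} = \sum_{k=n+1}^{m} |X_k|^2$. In particular, for a.e. $\omega \in \Omega$ we have

\[
\sup_{m > n \ge 0} \sup_{T \ge 1}
\frac{\max_{\mathbf{t} \in [-T,T]^s} \bigl| \sum_{k=n+1}^{m} X_k(\omega) e^{i(\lambda_k^{(1)} t_1 + \cdots + \lambda_k^{(s)} t_s)} \bigr|^2}
{R_{n,m} \log(T+1) \log\bigl(m \vee \max_{1 \le k \le m} \max_{1 \le i \le s} |\lambda_k^{(i)}|\bigr)} < \infty.
\tag{2}
\]

(ii) If $\{X_n\} \subset L_2(\Omega, \mu)$ are centered independent, then (1) and (2) remain true
with $R_{n,m} = \sum_{k=n+1}^{m} (|X_k|^2 + E|X_k|^2)$.

(iii) If $\{X_n\}$ is a bounded martingale difference sequence, then (1) and (2) remain
true with $R_{n,m} = \sum_{k=n+1}^{m} \|X_k\|_\infty^2$.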

We also obtain results similar to Theorem 1.1 for centered complex bounded
random variables which are not necessarily independent. In this case, the quantities
$R_{n,m}$ involve some (uniform) correlation coefficients.

This theorem generalizes recent results of Weber [37] and Fan and
Schneider [16], the first of which is a one-dimensional version of Theorem 1.1
for periodic polynomials with independent symmetric coefficients, that is, $s = 1$
and $\{\lambda_n\}$ is a nondecreasing sequence of natural numbers. The paper of Fan
and Schneider gives similar estimates (in a one-dimensional setting) only with
$L^1(\Omega)$-integrability, while we obtain here an Orlicz space integrability, defined by
the function $e^{x^2} - 1$. Moreover, Theorem 1.1 shows that Theorem 1 of [16] holds
without their (quite restrictive) condition (V). We would also like to underline the
fact that in (1) we take the supremum over $T$ inside the integral. This seems to be
crucial in applications such as Stein's interpolation (see Theorem 5.3 below)
or random ergodic theorems for flows, which we will explore in a forthcoming
paper [6].

Theorem 1.1 is also a generalization of a well-known theorem of Salem and 
Zygmund [34] for Fourier series whose terms have random signs. Actually, our 
proof relies on ideas of [34]. It turns out that estimates like (1) are really of a 
probabilistic nature. In particular, the power of Bernstein's inequality, which was
used in [34] (to deal with random trigonometric polynomials), is not needed. It is
this remark that allows one to consider general sequences $\{\lambda_n^{(1)}\}, \ldots, \{\lambda_n^{(s)}\}$ which
are not necessarily integer-valued, and which are not required to satisfy any further
assumptions.

To reach the case of general random variables, that is, random variables which
are not necessarily independent, we use recent results of [13] (see Proposition 2.5)
which permit us to deduce exponential inequalities of Azuma's type.

In the second part, we first use our estimates to obtain a.s. uniform convergence
on compacta of certain random series of functions of the form
$\sum_{n=1}^{\infty} X_n e^{i(\lambda_n^{(1)} t_1 + \cdots + \lambda_n^{(s)} t_s)}$, when $\{X_n\}$ are general bounded or independent (either centered
or symmetric). For these results see Section 4.1.
Then we apply our results to ergodic theory. A special case of Theorem 1.1, due 
to [37], was used in [8] to prove the following: 




Let $\{X_n\} \subset L_2(\Omega, \mu)$ be a sequence of centered independent random variables,
such that

\[ \sum_{n=1}^{\infty} \|X_n\|_2^2 (\log n)^2 < \infty. \]

Then almost surely the sequence $a_n = X_n(\omega)$ has the following property: for any
contraction $T$ on $L_2(Y, \pi)$ of a probability space and $f \in L_2(Y, \pi)$, the series
$\sum_{n=1}^{\infty} X_n(\omega) T^n f$ converges $\pi$-a.e.


This result raises some questions. If $T$ is induced by a probability-preserving
transformation $\tau$, what can be said about functions not in $L_2$, but in some $L_q$,
$1 < q < 2$? What if we take a sequence of powers $\{j_n\}$? Are there analogues for
several commuting transformations? These questions are answered by our main
application of Theorem 1.1 in the following (see Section 5):


THEOREM 1.2. Let $\{X_n\} \subset L_2(\Omega, \mu)$ be a sequence of centered independent
random variables, and let $\{j_n^{(1)}\}, \ldots, \{j_n^{(s)}\}$ be sequences of natural numbers. Assume
that the series

\[ \sum_{n=1}^{\infty} \|X_n\|_2^2 \log\Bigl(n \vee \max_{1 \le k \le n} \max_{1 \le i \le s} j_k^{(i)}\Bigr) (\log n)^2 \]

converges. Then there exists a set $\Omega^* \subset \Omega$ of full measure, such that for every
$\omega \in \Omega^*$ we have the following: for every commuting family of measure-preserving
transformations $\tau_1, \ldots, \tau_s$ on a probability space $(Y, \pi)$, and any $f \in L_q(Y, \pi)$,
$1 < q \le 2$, the series

\[ \sum_{n=1}^{\infty} \frac{X_n(\omega)\, f \circ \tau_1^{j_n^{(1)}} \cdots \tau_s^{j_n^{(s)}}}{n^{(2-q)/2q}} \]

converges $\pi$-a.e.


The special case of Theorem 1.2, with $s = 1$, $j_n = n$ and $q = 2$, is also a special
case of the result of [8] quoted above (for extensions to the case of two commuting
contractions see Theorem 5.2 below).

We conclude the paper by showing connections of our results with the Wiener- 
Wintner property introduced by Assani in [2]; see Section 6. 


2. Preliminary estimates. 


2.1. Estimates for almost periodic polynomials. In [34], Lemma 4.2.3, Salem 
and Zygmund used Bernstein's inequality to compare the maximum of a trigono- 
metric polynomial with its values on a certain interval. Then they obtained a very 
sharp result; it seems that in their application, the full strength of their result is not 
used. We give here some elementary estimates which will be useful in the study of 
general exponential sums. 




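LEMMA 2.1. Let $\{\alpha_n\}$ be a sequence of complex numbers, and let $\{\lambda_n\}$ be
a sequence of real numbers. For any $n \ge 1$ put $P_n(t) = \sum_{k=1}^{n} \alpha_k e^{i\lambda_k t}$. Then, for
every integer $m \ge 1$ and $T > 0$, we have:

(i) $\max_{1 \le n \le m} \max_{t \in [-T,T]} |P_n'(t)| \le 3m \max_{1 \le n \le m} (|\lambda_n|) \cdot \max_{1 \le n \le m} \max_{t \in [-T,T]} |P_n(t)|$.

If in addition $\{\lambda_n\}$ is positive and nondecreasing, then

(ii) $\max_{1 \le n \le m} \max_{t \in [-T,T]} |P_n'(t)| \le 2\lambda_m \max_{1 \le n \le m} \max_{t \in [-T,T]} |P_n(t)|$.

PROOF. Define $P_0 \equiv 0$. For $t \in [-T,T]$ and $1 \le n \le m$, we have

\[
P_n'(t) = i \sum_{k=1}^{n} \lambda_k \alpha_k e^{i\lambda_k t}
= i \sum_{k=1}^{n} \lambda_k \bigl(P_k(t) - P_{k-1}(t)\bigr)
= i \sum_{k=1}^{n-1} P_k(t)(\lambda_k - \lambda_{k+1}) + i\lambda_n P_n(t).
\]

Hence, the results clearly follow. □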
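The last step of the proof of Lemma 2.1 can be spelled out as follows. This is only a sketch of the bookkeeping; the abbreviation $M$ is introduced here for brevity and does not appear in the original.

```latex
% Notation (assumed here): M := \max_{1\le n\le m}\max_{t\in[-T,T]} |P_n(t)|.
% General case (i): each |\lambda_k - \lambda_{k+1}| is at most 2\max_j|\lambda_j|.
\begin{align*}
|P_n'(t)|
  &\le \sum_{k=1}^{n-1} |P_k(t)|\,|\lambda_k-\lambda_{k+1}| + |\lambda_n|\,|P_n(t)| \\
  &\le M\bigl(2(n-1)+1\bigr)\max_{1\le j\le m}|\lambda_j|
   \;\le\; 3m \max_{1\le j\le m}|\lambda_j| \cdot M .
\end{align*}
% Monotone case (ii): the differences telescope.
\begin{align*}
|P_n'(t)|
  &\le M\Bigl(\sum_{k=1}^{n-1}(\lambda_{k+1}-\lambda_k) + \lambda_n\Bigr)
   = M\,(2\lambda_n-\lambda_1)
   \;\le\; 2\lambda_m M .
\end{align*}
```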


REMARKS. 1. When $\{\lambda_n\}$ is a sequence of natural numbers and $T = \pi$, Bernstein's
inequality yields that for any $n \ge 1$,

\[ \max_{t \in [-\pi,\pi]} |P_n'(t)| \le \max_{1 \le k \le n} \{\lambda_k\} \cdot \max_{t \in [-\pi,\pi]} |P_n(t)|. \]

It is clearly much stronger than (i), and in the monotonic case it is also stronger
than (ii).

2. The idea behind the previous simple lemma is to bypass the use of Bernstein's
approximation for the derivative of a trigonometric polynomial, in order to overcome
the difficulties which appear in trying to extend such an approximation to
almost periodic polynomials. Since in this paper the sequence $\{\alpha_n\}$ will always represent
a realization of some random variables, it turns out that the obtained result
is sufficient for our needs.
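The two bounds of Lemma 2.1 are easy to test numerically for frequencies that are not integers, which is exactly the situation where Bernstein's inequality is unavailable. The following is a minimal sketch; the function names, the random coefficients, the grid and the seed are all illustrative assumptions.

```python
import cmath
import random


def abs_partial_sums(alphas, lams, t):
    """Return (|P_n(t)|, |P_n'(t)|) for n = 1..m, with P_n(t) = sum_{k<=n} alpha_k e^{i lam_k t}."""
    P, dP = 0j, 0j
    values, derivs = [], []
    for a, lam in zip(alphas, lams):
        term = a * cmath.exp(1j * lam * t)
        P += term
        dP += 1j * lam * term   # term-by-term derivative
        values.append(abs(P))
        derivs.append(abs(dP))
    return values, derivs


def grid_maxima(lams, T, rng, grid_points=2001):
    """Maximum over n and over a grid of [-T, T] of |P_n'| and of |P_n|."""
    alphas = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in lams]
    max_P = max_dP = 0.0
    for j in range(grid_points):
        t = -T + 2.0 * T * j / (grid_points - 1)
        values, derivs = abs_partial_sums(alphas, lams, t)
        max_P = max(max_P, max(values))
        max_dP = max(max_dP, max(derivs))
    return max_dP, max_P


rng = random.Random(1)
m, T = 12, 3.0
general = [rng.uniform(-5.0, 5.0) for _ in range(m)]       # arbitrary real frequencies
monotone = sorted(rng.uniform(0.1, 5.0) for _ in range(m))  # positive, nondecreasing

d1, p1 = grid_maxima(general, T, rng)
bound_i = 3 * m * max(abs(x) for x in general) * p1         # right-hand side of (i)
d2, p2 = grid_maxima(monotone, T, rng)
bound_ii = 2 * monotone[-1] * p2                            # right-hand side of (ii)
print(d1 <= bound_i, d2 <= bound_ii)
```

Because the summation-by-parts identity in the proof holds pointwise in $t$, the inequalities remain valid when both maxima are taken over the same finite grid, so the check does not depend on how fine the grid is.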


NOTATION. For a positive sequence $\{c_n\}$, we define $c_m^* := \max_{1 \le n \le m} c_n$.


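LEMMA 2.2. Let $\{\alpha_n\}$ be a sequence of complex numbers, and let
$\{\lambda_n^{(1)}\}, \ldots, \{\lambda_n^{(s)}\}$ be sequences of real numbers. Put
$P_n(t_1, \ldots, t_s) = \sum_{k=1}^{n} \alpha_k e^{i(\lambda_k^{(1)} t_1 + \cdots + \lambda_k^{(s)} t_s)}$.
Then for every $T > 0$ and for every integer $m \ge 1$, there exists
a rectangle $I \subset [-T,T]^s$, with area $|I|$, satisfying
$|I| \ge \prod_{i=1}^{s} \min\{1/(6sm|\lambda^{(i)}|_m^*),\, T\}$,
such that for every $(t_1, \ldots, t_s) \in I$ we have

\[
\max_{1 \le n \le m} |P_n(t_1, \ldots, t_s)| \ge \frac{1}{2} \max_{1 \le n \le m} \max_{(u_1, \ldots, u_s) \in [-T,T]^s} |P_n(u_1, \ldots, u_s)|.
\]

If in addition $\{\lambda_n^{(1)}\}, \ldots, \{\lambda_n^{(s)}\}$ are all positive and nondecreasing, then we can
take $I$ with $|I| \ge \prod_{i=1}^{s} \min\{1/(4s\lambda_m^{(i)}),\, T\}$.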


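PROOF. We first prove the nonmonotonic case. If for some $1 \le i \le s$ we have
$|\lambda^{(i)}|_m^* = 0$, then the polynomials $(P_n)_{n=1}^{m}$ are all constant with respect to the argument
$t_i$. So, we assume that $|\lambda^{(i)}|_m^* \ne 0$ for any $1 \le i \le s$.

There exist $1 \le n_0 \le m$ and $(u_1, \ldots, u_s) \in [-T,T]^s$, such that

\[
M := |P_{n_0}(u_1, \ldots, u_s)| = \max_{1 \le n \le m} \max_{(t_1, \ldots, t_s) \in [-T,T]^s} |P_n(t_1, \ldots, t_s)|.
\]

Let $(t_1, \ldots, t_s) \in [-T,T]^s$. We have

\[
P_{n_0}(t_1, \ldots, t_s) - P_{n_0}(u_1, \ldots, u_s) = \sum_{i=1}^{s} (t_i - u_i) \frac{\partial P_{n_0}}{\partial t_i}(t_1', \ldots, t_s'),
\]

where $(t_1', \ldots, t_s')$ is on the line segment joining $(u_1, \ldots, u_s)$ and $(t_1, \ldots, t_s)$.
Hence using Lemma 2.1(i)

\begin{align*}
M - |P_{n_0}(t_1, \ldots, t_s)|
&\le \bigl| |P_{n_0}(t_1, \ldots, t_s)| - |P_{n_0}(u_1, \ldots, u_s)| \bigr| \\
&\le |P_{n_0}(t_1, \ldots, t_s) - P_{n_0}(u_1, \ldots, u_s)| \\
&\le 3m|\lambda^{(1)}|_m^* \max_{1 \le n \le m} \max_{v_1 \in [-T,T]} |P_n(v_1, t_2', \ldots, t_s')| \cdot |t_1 - u_1| \\
&\quad + \cdots + 3m|\lambda^{(s)}|_m^* \max_{1 \le n \le m} \max_{v_s \in [-T,T]} |P_n(t_1', \ldots, t_{s-1}', v_s)| \cdot |t_s - u_s| \\
&\le 3mM \sum_{i=1}^{s} |\lambda^{(i)}|_m^* \, |t_i - u_i|.
\end{align*}

Put

\[
I := \Bigl\{ (t_1, \ldots, t_s) \in [-T,T]^s : |t_i - u_i| \le \min\Bigl\{ \frac{1}{6s \cdot m|\lambda^{(i)}|_m^*},\, T \Bigr\},\ 1 \le i \le s \Bigr\}.
\]

For any $(t_1, \ldots, t_s) \in I$ we have

\[
\max_{1 \le n \le m} |P_n(t_1, \ldots, t_s)| \ge |P_{n_0}(t_1, \ldots, t_s)| \ge \frac{M}{2}.
\]

The monotonic case follows in a similar way. Using Lemma 2.1(ii) we obtain

\begin{align*}
\bigl| |P_{n_0}(t_1, \ldots, t_s)| - |P_{n_0}(u_1, \ldots, u_s)| \bigr|
&\le 2\lambda_m^{(1)} \max_{1 \le n \le m} \max_{v_1 \in [-T,T]} |P_n(v_1, t_2', \ldots, t_s')| \cdot |t_1 - u_1| \\
&\quad + \cdots + 2\lambda_m^{(s)} \max_{1 \le n \le m} \max_{v_s \in [-T,T]} |P_n(t_1', \ldots, t_{s-1}', v_s)| \cdot |t_s - u_s|.
\end{align*}

In that case we put

\[
I := \Bigl\{ (t_1, \ldots, t_s) \in [-T,T]^s : |t_i - u_i| \le \min\Bigl\{ \frac{1}{4s\lambda_m^{(i)}},\, T \Bigr\},\ 1 \le i \le s \Bigr\}. \qquad \Box
\]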
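Lemma 2.2 says that the maximal partial sum stays above half of its overall maximum on a set of controlled size. The following minimal sketch checks this for $s = 1$ by estimating the Lebesgue measure of the set where $\max_n |P_n(t)| \ge M/2$ on a grid and comparing it with the lower bound $\min\{1/(6m|\lambda|_m^*), T\}$. All names, the random data, the grid and the seed are illustrative assumptions.

```python
import cmath
import math
import random


def max_partial_sum(alphas, lams, t):
    """max over n of |P_n(t)|, with P_n(t) = sum_{k<=n} alpha_k e^{i lam_k t}."""
    P, best = 0j, 0.0
    for a, lam in zip(alphas, lams):
        P += a * cmath.exp(1j * lam * t)
        best = max(best, abs(P))
    return best


rng = random.Random(2)
m, T = 10, math.pi
lams = [rng.uniform(-8.0, 8.0) for _ in range(m)]
alphas = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(m)]

N = 20001
h = 2.0 * T / (N - 1)                       # grid spacing on [-T, T]
values = [max_partial_sum(alphas, lams, -T + j * h) for j in range(N)]
M = max(values)                             # grid version of the overall maximum

# Riemann-type estimate of the measure of {t : max_n |P_n(t)| >= M/2}.
measure = h * sum(1 for v in values if v >= M / 2.0)

lam_star = max(abs(x) for x in lams)
lower = min(1.0 / (6 * m * lam_star), T)    # size of the interval promised by Lemma 2.2 (s = 1)
print(measure, lower)
```

In typical runs the estimated measure is far larger than the guaranteed lower bound, which reflects how crude the constant $3m$ in Lemma 2.1(i) is; for the purposes of the paper only the order of magnitude in $m$ and $|\lambda|_m^*$ matters.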


DEFINITION 2.1. Let $K$ be a separable compact topological space, and let $\nu$
be a Borel measure on $K$. Let $\{f_n\}$ be a sequence of complex continuous functions
on $K$, such that $\sup_{n \ge 1} \max_{x \in K} |f_n(x)| < \infty$. Let $\{\sigma_n\}$ be a nondecreasing
sequence, with $\sigma_1 \ge 1$. We say that the sequence $\{f_n\}$ forms a $\{\sigma_n\}$-system
on $(K, \nu)$ if there exist some constants $\rho_1 > 0$ and $0 < \rho_2 \le 1$, such that for every
sequence $\{\alpha_n\}$ of complex numbers, and for every $m \ge 1$, there exists a measurable
set $I_m \subset K$, with $\nu(I_m) \ge \rho_1/\sigma_m$, such that

\[
\max_{1 \le n \le m} \Bigl| \sum_{k=1}^{n} \alpha_k f_k(x) \Bigr| \ge \rho_2 \max_{1 \le n \le m} \max_{y \in K} \Bigl| \sum_{k=1}^{n} \alpha_k f_k(y) \Bigr|
\]

for every $x \in I_m$.


REMARKS. 1. By taking $\{\alpha_n\}$ with zero terms, we can consider the inequality
above on any blocks.

2. Definition 2.1 is inspired by the general observation made in [20], Theorem 1,
page 68.


EXAMPLE 2.1. Let $\nu$ be the Lebesgue measure on $\mathbb{R}^s$, and let $T \ge 1$. By
Lemma 2.2, the sequence $\{e^{i(\lambda_n^{(1)} t_1 + \cdots + \lambda_n^{(s)} t_s)}\}$ forms a
$\{(6sn)^s \prod_{i=1}^{s} (T|\lambda^{(i)}|_n^* + 1)\}$-system on $([-T,T]^s, \nu)$.
If $\{\lambda_n^{(1)}\}, \ldots, \{\lambda_n^{(s)}\}$ are all positive and nondecreasing, it
forms a $\{(4s)^s \prod_{i=1}^{s} (T\lambda_n^{(i)} + 1)\}$-system.


DEFINITION 2.2. Let $K$ be a separable topological space, and let $\nu$ be a Borel
measure on $K$. Let $(K_r)_{r=1}^{\infty} \subset K$ be a sequence of compact subspaces of $K$. We
say that $\{f_n\}$ forms a uniform $\{\sigma_n\}$-system on $\{(K_r, \nu)\}_{r=1}^{\infty}$ if for every $r \ge 1$ the
sequence $\{f_n\}$ forms a $\{\sigma_n\}$-system on $(K_r, \nu)$ with the same corresponding constants
$\rho_1$ and $\rho_2$.


EXAMPLE 2.2. For every $r \ge 1$ put $K_r = [-r,r]^s$. Using Lemma 2.2 (see also Example 2.1
above), the sequence $\{e^{i(\lambda_n^{(1)} t_1 + \cdots + \lambda_n^{(s)} t_s)}\}$ forms a uniform
$\{n^s \prod_{i=1}^{s} (|\lambda^{(i)}|_n^* + 1)\}$-system on $\{(K_r, \nu)\}_{r=1}^{\infty}$.
If $\{\lambda_n^{(1)}\}, \ldots, \{\lambda_n^{(s)}\}$ are all positive nondecreasing, it
forms a uniform $\{\prod_{i=1}^{s} (\lambda_n^{(i)} + 1)\}$-system.




Let $\{X_n\}$ be a sequence of complex random variables on $(\Omega, \mu)$. Given a sequence
$\{f_k\} \subset C(K)$, for any $0 \le j < l$, we define the random continuous function
$S_{j,l} := \sum_{k=j+1}^{l} X_k f_k$. By separability of $K$ and continuity in $x \in K$ for each fixed
$\omega \in \Omega$, we can compute $\|S_{j,l}\| = \max_{x \in K} |\sum_{k=j+1}^{l} X_k(\omega) f_k(x)|$ as the supremum
over a fixed countable dense subset of $K$, so $\|S_{j,l}\|$ is measurable, and $S_{j,l}$ is a
$C(K)$-valued random variable.


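LEMMA 2.3. Let $\{X_n\}$ be a sequence of complex random variables on $(\Omega, \mu)$,
and let $\{f_n\}$ be a $\{\sigma_n\}$-system on $(K, \nu)$, with constants $\rho_1$ and $\rho_2$ as above. Then
for every positive nondecreasing function $\psi$, and every $m > n \ge 0$, we have

\[
E\Bigl( \max_{n < k \le m} \max_{x \in K} \psi\bigl(|S_{n,k}(x)|\bigr) \Bigr)
\le \frac{\sigma_m}{\rho_1} \int_K E\Bigl( \max_{n < k \le m} \psi\Bigl(\frac{1}{\rho_2} |S_{n,k}(x)|\Bigr) \Bigr) \, d\nu(x).
\]

PROOF. Since $\{f_n\}$ forms a $\{\sigma_n\}$-system on $(K, \nu)$, for every $m \ge 1$, and every
$\omega \in \Omega$, there exists a Borel measurable set $I_m(\omega) \subset K$, with $\nu(I_m(\omega)) \ge \rho_1/\sigma_m$,
such that for every $x \in I_m(\omega)$ we have

\[
\psi\Bigl( \max_{n < k \le m} \max_{y \in K} |S_{n,k}(y)| \Bigr) \le \psi\Bigl( \frac{1}{\rho_2} \max_{n < k \le m} |S_{n,k}(x)| \Bigr).
\]

Integrating this inequality over $I_m(\omega)$, and using the monotonicity and positivity of $\psi$, we obtain

\[
\max_{n < k \le m} \max_{y \in K} \psi\bigl(|S_{n,k}(y)|\bigr) \le \frac{\sigma_m}{\rho_1} \int_K \max_{n < k \le m} \psi\Bigl( \frac{1}{\rho_2} |S_{n,k}(x)| \Bigr) \, d\nu(x).
\]

Taking the expectation yields the result. □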


2.2. Moment inequalities for the partial sums. We recall here some results 
that we will use in Section 3. The following lemma is basically Lemma 3 in [29], 
part I. 


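LEMMA 2.4. Let $Z$ be a nonnegative random variable on $(\Omega, \mu)$, and let $C_1$
and $C_2$ be some positive constants. If $\int Z^{2n} \, d\mu \le C_1 (C_2 n)^n$ for every $n \ge 1$, then

\[ \int \exp(\delta Z^2) \, d\mu \le 1 + \frac{C_1}{1 - e\delta C_2} \qquad \text{for every } \delta < \frac{1}{eC_2}. \]

PROOF. Using the estimate $n! \ge \sqrt{2\pi}\, n^{n+1/2} e^{-n+1/(12n+1)}$ (see [17], page
52), we have

\begin{align*}
\int e^{\delta Z^2} \, d\mu
&= \int \Bigl( 1 + \sum_{n=1}^{\infty} \frac{\delta^n Z^{2n}}{n!} \Bigr) d\mu
\le 1 + C_1 \sum_{n=1}^{\infty} \frac{(\delta C_2)^n n^n}{n!} \\
&\le 1 + C_1 \sum_{n=1}^{\infty} \frac{(\delta C_2)^n n^n e^n}{\sqrt{2\pi}\, n^{n+1/2}}
\le 1 + C_1 \sum_{n=1}^{\infty} (e\delta C_2)^n
\le 1 + \frac{C_1}{1 - e\delta C_2}. \qquad \Box
\end{align*}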
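As a sanity check of Lemma 2.4, one can take $Z = |G|$ with $G$ standard Gaussian, for which everything is explicit: $E Z^{2n} = (2n-1)!! \le (2n)^n$, so the hypothesis holds with $C_1 = 1$ and $C_2 = 2$, and $E \exp(\delta Z^2) = (1 - 2\delta)^{-1/2}$ for $\delta < 1/2$. The sketch below compares this exact value with the bound $1 + C_1/(1 - e\delta C_2)$ as the lemma is read here; the Gaussian example and the function names are illustrative assumptions, not taken from the paper.

```python
import math

C1, C2 = 1.0, 2.0  # constants for the standard Gaussian example (an assumption of this sketch)


def double_factorial_odd(n):
    """(2n-1)!! = E[G^(2n)] for a standard Gaussian G."""
    out = 1
    for k in range(1, 2 * n, 2):
        out *= k
    return out


# Hypothesis of the lemma: E Z^(2n) <= C1 * (C2 * n)^n for every n >= 1.
moment_ok = all(double_factorial_odd(n) <= C1 * (C2 * n) ** n for n in range(1, 40))


def exact(delta):
    """E exp(delta * G^2) for a standard Gaussian G, valid for delta < 1/2."""
    return 1.0 / math.sqrt(1.0 - 2.0 * delta)


def lemma_bound(delta):
    """Right-hand side 1 + C1 / (1 - e * delta * C2), valid for delta < 1/(e * C2)."""
    return 1.0 + C1 / (1.0 - math.e * delta * C2)


deltas = [0.01, 0.05, 0.10, 0.15, 0.18]   # all below 1/(2e), roughly 0.1839
for d in deltas:
    print(d, exact(d), lemma_bound(d))
```

The bound is far from sharp in this example, and the admissible range $\delta < 1/(eC_2)$ is smaller than the true range $\delta < 1/2$; the lemma trades sharpness for a statement that needs only moment bounds.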




Let $\{X_n\}$ be a sequence of random variables. For any $k \ge 1$, let $\mathcal{F}_k :=
\sigma(X_1, \ldots, X_k)$ be the $\sigma$-algebra generated by $\{X_1, \ldots, X_k\}$. The following result
was obtained by Dedecker.


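PROPOSITION 2.5 ([13], Proposition 1). Let $2 \le p < \infty$, and let $\{X_n\} \subset
L_p(\Omega, \mu)$ be a sequence of real centered random variables. Then we have

\[
\Bigl\| \sum_{k=1}^{n} X_k \Bigr\|_p \le \Bigl( 2p \Bigl( \sum_{i=1}^{n} \|X_i^2\|_{p/2} + \sum_{i=2}^{n} \sum_{k=1}^{i-1} \|X_k E(X_i | \mathcal{F}_k)\|_{p/2} \Bigr) \Bigr)^{1/2}.
\tag{3}
\]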

REMARKS. 1. If $\{X_n\}$ are bounded centered random variables, and in the
above inequality we put the $L_\infty$-norm in the right-hand side, then Dedecker's result
can be deduced from Theorem 2.4 in [32], page 42.

2. Let $X \in L_r(\Omega, \mu)$, $1 \le r \le \infty$, and let $\mathcal{F} \subset \mathcal{F}'$ be two $\sigma$-algebras of $\Omega$.
Since the conditional expectation, with respect to a fixed $\sigma$-algebra, contracts the
$L_r$-norm, we have

\[ \|E(X | \mathcal{F})\|_r = \|E(E(X | \mathcal{F}') | \mathcal{F})\|_r \le \|E(X | \mathcal{F}')\|_r. \tag{4} \]

As a consequence of (4), we may replace $\mathcal{F}_k$ in Dedecker's result by any $\sigma$-algebra
$\mathcal{F}_k'$ with respect to which $\{X_1, \ldots, X_k\}$ are measurable. In particular, by the usual
complexification, it is easy to obtain Dedecker's result for $\{X_n\}$ with complex values.
In that case, a factor 2 should be added in front of the right-hand side of (3).

3. As noticed in [13], this result contains Burkholder's inequality for martingale
difference sequences.

4. An inspection of the proof of Proposition 1 in [13] shows that the "centered"
assumption is not needed.


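The following maximal inequality was obtained by Móricz.

PROPOSITION 2.6 ([26], Theorem 1). Let $1 \le p < \infty$, and let $\{X_n\} \subset
L_p(\Omega, \mu)$ be a sequence of complex random variables. Assume there exist nonnegative
numbers $\{\alpha_n\}$, and some positive constants $C$ and $q > 1$, such that

\[
\Bigl\| \sum_{k=j+1}^{l} X_k \Bigr\|_p^p \le C \Bigl( \sum_{k=j+1}^{l} \alpha_k \Bigr)^q \qquad \text{for every } l > j \ge 0.
\]

Then for any $m > n \ge 0$,

\[
\Bigl\| \max_{n < l \le m} \Bigl| \sum_{k=n+1}^{l} X_k \Bigr| \Bigr\|_p^p \le C_{p,q} \Bigl( \sum_{k=n+1}^{m} \alpha_k \Bigr)^q,
\]

where $C_{p,q} = C (1 - 2^{(1-q)/p})^{-p}$.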




REMARKS. 1. Originally, Theorem 1 of [26] is stated for real random variables
but it extends easily to the case of complex ones. More generally, it extends
to the case of Banach-valued random variables.

2. Fix $m > n \ge 0$, and assume that in the assumptions of Proposition 2.6 we
only have

\[
\Bigl\| \sum_{k=j+1}^{l} X_k \Bigr\|_p^p \le C \Bigl( \sum_{k=j+1}^{l} \alpha_k \Bigr)^q \qquad \text{for every } n \le j < l \le m.
\]

This will allow us to estimate the $L_p$-norm of $\max_{n < l \le m} |\sum_{k=n+1}^{l} X_k|$. Indeed, we
put $X_k' = X_k$ and $\alpha_k' = \alpha_k$ for $n < k \le m$, and otherwise we put $X_k' = 0$ and $\alpha_k' = 0$.
Now we apply Proposition 2.6 to $\{X_k'\}$ and $\{\alpha_k'\}$ (see also [8], Proposition 2.3).

3. For fixed $1 \le p < \infty$, the quantity $C_{p,q}$ tends monotonically to infinity
when $q \downarrow 1$.

4. For $q = 1$, Proposition 2.6 is no longer true. Under the same assumptions, but
with $q = 1$, we have (see Theorem 3 in [26] or Proposition 2.2 in [8])

\[
\Bigl\| \max_{n < l \le m} \Bigl| \sum_{k=n+1}^{l} X_k \Bigr| \Bigr\|_p^p \le C (2 + \log_2 m)^p \sum_{k=n+1}^{m} \alpha_k
\tag{5}
\]

for any $m > n \ge 0$.

The proofs of (5) (as given in [26] or [8]) extend to the case of Banach-valued
random variables.
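Remark 3 above can be made concrete by evaluating the constant of Proposition 2.6 numerically. The sketch below assumes the constant has the form $C_{p,q} = C(1 - 2^{(1-q)/p})^{-p}$; the function name `moricz_constant` and the sample values of $p$ and $q$ are illustrative choices.

```python
def moricz_constant(p, q, C=1.0):
    """C_{p,q} = C * (1 - 2^((1-q)/p))^(-p), assumed form of the constant in Proposition 2.6."""
    return C * (1.0 - 2.0 ** ((1.0 - q) / p)) ** (-p)


p = 2.0
qs = [2.0, 1.5, 1.1, 1.01, 1.001]            # q decreasing towards 1
values = [moricz_constant(p, q) for q in qs]
for q, v in zip(qs, values):
    print(q, v)
```

The blow-up as $q \downarrow 1$ is the reason the case $q = 1$ needs the separate inequality (5), where the constant is replaced by a logarithmic factor in $m$.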


3. Uniform estimates for random polynomials. Let $\{X_n\} \subset L_\infty(\Omega, \mu)$ be
a sequence of complex centered random variables and let $\{f_n\} \subset C(K)$. For any
$m > n \ge 0$ define $S_{n,m}(x) := \sum_{k=n+1}^{m} X_k f_k(x)$ and


m m  i—i 
(6) Ram: = Y dXilas-- X Y. XE Fo loo. 
įi=n+ i i=n+2 k=i 


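THEOREM 3.1. Let $\{X_n\} \subset L_\infty(\Omega, \mu)$ be a sequence of complex centered
random variables. Let $\{f_n\}$ be a $\{\sigma_n\}$-system on $(K, \nu)$, with corresponding constants
$\rho_1$ and $\rho_2$. Then for every $m > n \ge 0$ we have (with 0/0 interpreted as 1)

\[
\Biggl\| \max_{n < l \le m} \exp\Biggl\{ \varepsilon \, \frac{\max_{x \in K} \bigl| \sum_{k=n+1}^{l} X_k f_k(x) \bigr|^2}{R_{n,m}} \Biggr\} \Biggr\|_{L_1(\mu)} \le \frac{3\nu(K)\sigma_m}{\rho_1},
\]

where $\varepsilon = \rho_2^2 / (6400e \cdot \max_{n < k \le m} \max_{x \in K} |f_k(x)|^2)$. In particular, for every
$1 \le p < \infty$ we have

\[
\Bigl\| \max_{n < l \le m} \max_{x \in K} \Bigl| \sum_{k=n+1}^{l} X_k f_k(x) \Bigr| \Bigr\|_{L_p(\mu)} \le C_p \sqrt{R_{n,m} \log(\sigma_m + 1)},
\]

where $C_p = \sqrt{[1 + 2\log(e^{p/2} + 3\nu(K)/\rho_1)]/\varepsilon}$.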


PROOF. By definition, {fa} is uniformly bounded. By homogeneity, it is 
enough to prove the theorem for the case where sup,.., maxyex |fx(x)] < 1. Fix 
x € K, and let p > 2. By application of Proposition 2.5 (see remarks 1 and 2 
after its formulation) to the sequence {Xz fx(x)} C Lo, u) we have 


[Sum (x) lp SV 8D Ry. m (x), where 


m m i-l 

(0 Rs): Y, MAOA DS YOU XefkQOE(XG fi) Fe) | 00, 

i=n+1 i=n+2k=1 
and f(x) := e(X1fi(x),..., Xx fe(x)). Clearly, we have F(x) C Fk =o (X1, 
..., Xj) for any x € K. Since X; fi(x) is measurable with respect to F(x), we 
may and do apply inequality (4) to obtain (1) with F(x) replaced by Fg. Using the 
assumption | f,(x)| x 1, we have Ry m(x) < Rn,m, where Rpm is defined by (6). 
Hence for every x € K and p > 2 we have 


| Sn,m (x) lp < y 8p Rn,m, 


for every m >n 2 0. 
For every i > 1 put 


i 
oj = X [XXE Fk) loo- 
k=l 
So, Ry,m = ? 041 €i. Fix po > 2, and take any p > po > 2. By Proposition 2.6, 
applied with q = p/2 > 1 and (oj), we obtain 


< (1— 2(1—p/2)/p)—" 8DRnm 
P 


« (1— 2(—-po/2/po) 7! IS DRam- 


For 2 < p < po, we use ||- ||p < || - Ilp)- Finally, we obtain for every 2 < p < oo 


and x e K 
s V 8PC py, Ryan, 
P 


where we can take C5, = po(1 — 2ü—po/2/ po). 
By application of Lemma 2.4, with C; = 1 and C5 = 16C py, Rn,m, we deduce 








| co 
n<l<m 








| max |5$,1(x)] 
n<l<m 


Í max expl) p = f exp(3 max Spa?) du 
n«lxm n«lxm 


1 
< | + ——________, 
- + 1—16e8C p, Rym 




for every à « Talo Ram’ and for every x € K. 
Let p; and p2 be the corresponding constants for the {o,,}-system ( fa}, as given 
in Definition 2.1. By Lemma 2.3, applied to the function u +> exp(6u7) for some 


2 
P» 
ô < 166, Ram’ we deduce 


J max exp(s max ShI) du 
n<l<m xeK 
Q 
6 
zm | J max exp( 15.10)? ) dudv(x) 
pi A n«l«m p> 


V(K)om ( 1 ) 
~ A ]— 16e(8/ ps )C' po Rn, m 


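Put $\delta = \rho_2^2/(32eC_{p_0}R_{n,m})$ and $\varepsilon_{p_0} = \rho_2^2/(32eC_{p_0})$. Since $p_0 > 2$ is arbitrary,
and $\min_{p_0 > 2}\{C_{p_0}\} \le 200$, we can choose $p_0 > 2$ such that $\varepsilon = \varepsilon_{p_0} = \rho_2^2/(6400e)$.
This yields the first result.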
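The numerical claim $\min_{p_0 > 2} C_{p_0} \le 200$ can be checked directly. The sketch below assumes that the constant chosen in the proof is $C_{p_0} = p_0 (1 - 2^{(1 - p_0/2)/p_0})^{-2}$, which is how it is read here; the function name and the search grid are illustrative.

```python
def c_p0(p0):
    """Assumed form of the constant: p0 * (1 - 2^((1 - p0/2)/p0))^(-2), for p0 > 2."""
    return p0 * (1.0 - 2.0 ** ((1.0 - p0 / 2.0) / p0)) ** (-2)


# Crude grid search over p0 in (2, 22); the constant blows up as p0 -> 2+
# and grows roughly linearly for large p0, so the minimum is interior.
grid = [2.0 + 0.01 * k for k in range(1, 2000)]
best_p0 = min(grid, key=c_p0)
best_value = c_p0(best_p0)
print(best_p0, best_value)
```

Under this reading the minimum lies comfortably below 200, so the round value 200 (and hence the factor $6400 = 32 \cdot 200$ in $\varepsilon$) is a convenient overestimate rather than a sharp constant.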

For p > 0, define the function $, (x) = (log(eP + x))? for any x > 0. Then 9; 
is concave, and (log x)? < $p(x) for x > 1. Hence, by Jensen’s inequality and the 
first result, we obtain 


Etle pI. (MaXn <l «m MaxXyeK | Sn, D? 
REIS 


<8 [ép (evo [s oret mtnen Bn y 
n,m 


3v(K)On 3 K Ed 
sé»( s ) = (es(77? + e) l 


Hence, 


max max|$, uy 
n«lzxm xcK 


2 
Ros 5 
« 7" log((e?!? + 3(K)/p1)om) 














Ec ( log(e?/? + 3v(K)/p1) 
T g log(om + 1) 


Since Om > 1, we obtain the second result with Cp as defined in the statement. CI 


Jess +1). 


REMARKS. 1. The above theorem generalizes ideas of Theorem 4.3.1 and 
Lemma 5.1.3 in [34] (see also [38], Chapter V, Theorem 8.34 and [20], Theorem 1, 
page 69). 

2. Clearly, the first statement of the theorem yields the finiteness of all the 
L y-norms. In fact, as noticed in [22], Lemma 3.7, page 66, it yields that all the 




L,-norms, 2 < p < oo, are comparable to the L2-norm. This in turn implies 
the comparability of all L,-norms, with p > 2. This is a weak generalization of 
the Kahane-Khinchine inequality for Rademacher series (see [23], Corollary 4.6, 
page 43 and see also [20], page 282). 

3. It is clear from Definition 2.1 that constant functions, that is, $f_n(x) = C \ne 0$
for every $n \ge 1$, form a $\{\sigma_n\}$-system for any nondecreasing $\{\sigma_n\}$. By doing this, we
may obtain classical maximal inequalities for sums of bounded random variables.


COROLLARY 3.2. Let 1 < p x2, and let {Xn} C Lp(&, p) be a sequence 
of complex centered independent random variables. Let (fy) be a {o,}-system 
on (K, v), with corresponding constants pi and pı. Then for every m>n> 0 
we have 


max max 
n«lzm xeK |, 


Y Xk f(x) 


k—n--1 


m I/p 
«2C ions + D( 25 xag) 


k-n--1 




















L p (u) 


where Cp = J [1 + 2log(e?/2 + 3v(K)/ p1)]/e. 


PROOF. We first prove the corollary when (X,) are symmetric random vari- 
ables. Let {€,} be.a Rademacher sequence, which is independent of {X,,}, and 
let E, be the corresponding expectation with respect to the probability space 
of {€n}. Let E be the expectation in (Q, u). Fix « € Q. We apply the second result 
of Theorem 3.1 to the independent sequence {X,,(w)é,}, in order to obtain 


l 
m l p/2 
< (Cp 12? (loom + D Y dos) 


k-—n4-l 


I 
SO Xk(o)ex fe(x) 


n<l<m xcK bendi 








E. E max 


< (C5/2)? (log(om + 1)? E IXO), 
k=n+1 


where Cp may be computed from Theorem 3.1. 
By taking the expectation E in the above inequality, we obtain that 


| 
< (Cp/DP (log(om +1)” Y^ E|X4IP. 
k=n-+1 


Y Xxkek f(x) 


kzzn4-l 








EE, E max 
n<l<m xcK 




Since {X,,} is symmetric and independent, and is also independent of [2,], the 
sequences (X525) and {Xn} are stochastically equivalent. So, we have 


| 


< (C5/2)  (log(o» +1))?* Y^ EXP. 
k-—n--1 


Y Xy fk (x) 


<l<m xeK bni 








e| max max 
n 


This proves the corollary for the symmetric case. 
Now, let (X7] be an independent copy of {Xn}, defined on (Q, u’), and let E’ 
- be the corresponding expectation. Clearly, the sequence (X, — X7) is a symmetric 
sequence, so we apply the first result of the proof, for the symmetric case, 1n order 
to obtain 
] 


 (Cp/2 (logon + D)” )» EE'|X, — X} P. 
=n+ 1 


Y (Xx — Xz) f(x) 


EE’ | max max 
& k=en+1 


<i<m xeK 








Using Jensen's inequality and the fact that E’(X,,) = 0, we have 
Í l 
l 
Y. (Xe — X) fa) 


k=n-+1 
Í 





! | 
Y^ Gu - XD fo) 


EE’ | max max 
k—n-Fi 


n«lzm xeK 





n<l<m xeK 


-E| max max E’ 


l 








l 


Y. Xf) 


k=n+1 


j 
n 

< (Cp/2)? (log(om + 1)? Y^ E'E'|X, — XL? 

kzzn-4-l 








> Ei max max 
n«lxm xeK 
Hence, 


Y Xy fk (x) 


k=n+1 








E| max max 
n<l<m xeK |, 


m 
< 2?(Cp/2)? (logo, +) Y^ EIXel?, 
k=n+1 


and the result follows. O 




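NOTATION. Let $s \ge 1$, and consider the Euclidean space $\mathbb{R}^s$. We denote by
boldface, for example, $\mathbf{t} = (t_1, \ldots, t_s)$, a vector in $\mathbb{R}^s$. For any $\mathbf{t}, \mathbf{u} \in \mathbb{R}^s$ we denote
by $\langle \mathbf{t}, \mathbf{u} \rangle = t_1 u_1 + \cdots + t_s u_s$ the inner product in $\mathbb{R}^s$. We recall our notation
$c_m^* = \max_{1 \le n \le m} c_n$, for any positive sequence $\{c_n\}$.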


COROLLARY 3.3. Let 1 < p <2, and let {Xn} C Lyp(Q, p) be a sequence 


of complex centered independent random variables. Let hy = (QU a AB bea 
sequence of vectors in R? , and let T > 1. Then there exists a positive conseant Cp; 
which does not depend on {Xn}, such that for every m > n > 0 we have 


l 
y^ x,ei 0o 




















max max 
n«Izmtel-T,T) k=n+1 E p(t) 
1/p 
p =] k=n+l 


PROOF. Let K =[—T, T]*, and let v be the Lebesgue measure on R*. By Ex- 
ample 2.1 the sequence {ef V?) forms a {m5 TE (1A&]* + 1)}-system on (K, v). 
Corollary 3.2 yields the result. □


Now we present further applications of Lemma 2.3. Let {Xn} C Lp(u), 2 < 
p < oo, be centered random variables. For any m > n > 0 define 


m m i-i 
RP) = Y^ XZ + Y SOX EF lly. 


iz=n-+ | i=n-+2k=1 


In the case of unbounded random variables, we can say the following: 


THEOREM 3.4. Let2 < p < oo, and let {Xn} C Lp(Q, u) be a sequence (not 
necessarily independent) of complex centered random variables. Let {fa} be a 
lon)-system on (K, v), with corresponding constants pı and p». Then for every 
m >n > Q0 we have 




















à X «C 1/p./ p(G» 
M iE MAGO s Cp mex, mal fe Glenn)! V Rim. 
1—p/2)/ py-1. 
where Cp = T$ GNE - (1 — 20-p/ és 


PROOF. The proof starts as the proof of Theorem 3.1. We use Dedecker’s 
inequality to obtain 


lS, Gp € max max | fe(x)|-¥8pR, 
n<k<m x€K 




for every m > n > 0. Then, using Móricz's inequality [26] we obtain 


2 1 20-2/2/p71 [gp RP) 
s max max|fiGo)l( ) V8pRim 


By Lemma 2.3, applied to the function u +> u?, we obtain the result. O 














Inax [5.1 (x)| 
n<l<m 


REMARKS. 1. Using remark 4 after Proposition 2.6, for p = 2 we obtain 


max max 
n<l<m xeK 


l 
XO Xiha) 


k=n+1 




















2 


<Czlog(4m) max max |fe (x)| (o) y Rin. 


s 4 
where C5 = ACY 


2. Let (jx) be a sequence of integers, and let fi (t) = e'/*^. Using the Cauchy- 
Schwarz inequality one can see that for every w € Q we have 


m 
<C jm J |Xxy(o)?. 
k=n-+1 


This illustrates the limitation of the method when p = 2. 


l 
2, Xe(w)e 


k:n4-1 


max max 
n<l<m te[—z,7) 








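THEOREM 3.5. Let $\{X_n\} \subset L_\infty(\Omega, \mu)$ be a sequence of complex centered
random variables. Let $\{f_n\}$ be a uniform $\{\sigma_n\}$-system on $\{(K_r, \nu)\}$. Assume that
for some $q \ge 1$, we have $\{1/\nu(K_r)\} \in \ell_{2q}$ and $\{1/\sigma_n\} \in \ell_q$. Then there exist
positive constants $\varepsilon$ and $C$, independent of $\{X_n\}$, such that (with 0/0 interpreted
as 1)

$$\left\| \sup_{m \ge 1} \max_{0 \le n < m} \sup_{r \ge 1} \exp\left\{ \varepsilon \cdot \frac{\max_{x \in K_r} \left| \sum_{k=n+1}^{m} X_k f_k(x) \right|^2}{R_{n,m} \log(\nu(K_r)\sigma_m + 1)} \right\} \right\|_{L_1(\mu)} \le C.$$

Hence for a.e. $\omega \in \Omega$ we have

$$\sup_{m > n \ge 0} \sup_{r \ge 1} \frac{\max_{x \in K_r} \left| \sum_{k=n+1}^{m} X_k(\omega) f_k(x) \right|^2}{R_{n,m} \log(\nu(K_r)\sigma_m + 1)} < \infty.$$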


PROOF. For any $\omega \in \Omega$ put $S_{n,m}(\omega)(x) = \sum_{k=n+1}^{m} X_k(\omega) f_k(x)$, and when $\omega$
is not specified write $S_{n,m}(x)$. By assumption $\{\nu(K_r)\}$ and $\{\sigma_n\}$ tend to infinity, so
without loss of generality we may and do assume that $\log(\nu(K_r)\sigma_m + 1) \ge 1$ for
every $m, r \ge 1$. This assumption affects only the values of $\varepsilon$ and $C$.
The sequence $\{f_n\}$ forms a uniform $\{\sigma_n\}$-system, hence for every $r \ge 1$, the
sequence $\{f_n\}$ forms a $\{\sigma_n\}$-system on $(K_r, \nu)$, with corresponding constants
$\rho_1$ and $\rho_2$. For every $r \ge 1$, we apply Theorem 3.1 with $(K_r, \nu)$ and $\rho_1, \rho_2$. Hence,
there exist universal constants $C_1, \varepsilon > 0$, such that for every $m > n \ge 0$ we have

$$\left\| \exp\left\{ \varepsilon \cdot \frac{\max_{x \in K_r} |S_{n,m}(x)|^2}{R_{n,m}} \right\} \right\|_{L_1(\mu)} \le C_1 \nu(K_r) \sigma_m,$$

for any $r \ge 1$. Hence for any $m > n \ge 0$ and $r \ge 1$, we have

$$(*) \qquad \left\| \exp\left\{ \varepsilon \cdot \frac{\max_{x \in K_r} |S_{n,m}(x)|^2}{R_{n,m}} - (2q+1) \log(\nu(K_r)\sigma_m) \right\} \right\|_{L_1(\mu)} \le \frac{C_1}{(\nu(K_r)\sigma_m)^{2q}}.$$

For any $m > n \ge 0$ and $r \ge 1$ put

$$I_{n,m,r} = \left\{ \omega \in \Omega : \varepsilon \cdot \max_{x \in K_r} |S_{n,m}(\omega)(x)|^2 > (2q+1) R_{n,m} \log(\nu(K_r)\sigma_m + 1) \right\}.$$

Using (*) we have


maXxex, |Sn,m(x)|" 
Rn m log(v (K,)oq + 1) 











dos (2q T DIL, 





exp[: 
Li(u) 


«Jette n 








<| maXreK, Sino 








— Qq + | lu... ] 


Rn,m log(v(Ky om + 1) 1) 
Ci 
ee Yeu 
(v(K;)om)74 
Hence 
m-1 2 
S, 
ve ws: x marek, Sam yg 4 Diam, 
nz: r>] Rn, m log(v(&, )9m T 1) Lio 














mCi 1/VEK I, 
wa aa a OR... d 
i (Om)? 


By assumption, the nonincreasing sequence $\{1/\sigma_n\}$ is in $\ell_q$. Hence by Kro-
necker's lemma, $\sup_{m \ge 1} (m/\sigma_m^q) < \infty$. We deduce


co m-—] 














maxsex, [Snym(x)I? | 
2 L2. 2. Pez Rn,m log CK ,)0, + 1) a HLG) 
2 
where $C_2 = C_1 \, \|\{1/\nu(K_r)\}\|_{\ell_{2q}}^{2q} \cdot \sup_{m \ge 1} (m/\sigma_m^q) \cdot \|\{1/\sigma_m\}\|_{\ell_q}^{q}$. So,
maXxex, |Snjm(x) |? | 
sup max supexp{ e: ——— ———.— —— — (2¢ + Din, <C>. 
s O<n<m ied »: Rn,m log(v(Ky)om + 1) C Y D pns, Lip) 
















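But, if $\omega \notin I_{n,m,r}$ for some $m > n \ge 0$ and $r \ge 1$, then

$$\varepsilon \cdot \frac{\max_{x \in K_r} |S_{n,m}(x)|^2}{R_{n,m} \log(\nu(K_r)\sigma_m + 1)} \le 2q + 1.$$

So,

$$(**) \qquad \left\| \sup_{m \ge 1} \max_{0 \le n < m} \sup_{r \ge 1} \exp\left\{ \varepsilon \cdot \frac{\max_{x \in K_r} |S_{n,m}(x)|^2}{R_{n,m} \log(\nu(K_r)\sigma_m + 1)} \right\} \right\|_{L_1(\mu)} \le C_2 e^{2q+1}.$$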














This proves the first assertion, so in particular the integrand is finite a.e., which
is the second statement. □


REMARKS. 1. The quantities $\varepsilon$ and $C$ in the above theorem can be completely
computed by Theorem 3.1. Note that C, = 3/p; in the proof of Theorem 3.5. 

2. The condition $\{1/\sigma_n\} \in \ell_r$, for some $r > 0$, together with the nondecreasingness
of $\{\sigma_n\}$ (by definition), is equivalent by Kronecker's lemma to $\sigma_n \ge Cn^{\delta}$, for some
$\delta > 0$ and for every $n \ge 1$. Since the assertions of Theorem 3.5 are not affected
by removing a finite number of pairs $(m > n)$, we could replace the assumption
$\{1/\sigma_n\} \in \ell_q$ by $\sigma_n \ge Cn^{\delta}$, for some $\delta > 0$ and for every $n \ge 1$.

3. If $\{1/\sigma_n\}$ is not in $\ell_r$ for any $r > 0$, then $\{f_n\}$ is still a $\{\max\{n, \sigma_n\}\}$-system.
With respect to this system, $C$ in the above theorem is independent of $\{\sigma_n\}$.


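COROLLARY 3.6. Let $\{X_n\}$ be symmetric independent complex valued ran-
dom variables on $(\Omega, \mu)$. Let $\{f_n\}$ be a uniform $\{\sigma_n\}$-system on $\{(K_r, \nu)\}$. Assume
that for some $q \ge 1$, we have $\{1/\nu(K_r)\} \in \ell_{2q}$ and $\{1/\sigma_n\} \in \ell_q$. Then there exist
positive constants $\varepsilon$ and $C$, independent of $\{X_n\}$, such that (with 0/0 inter-
preted as 1)

$$\left\| \sup_{m \ge 1} \max_{0 \le n < m} \sup_{r \ge 1} \exp\left\{ \varepsilon \cdot \frac{\max_{x \in K_r} \left| \sum_{k=n+1}^{m} X_k f_k(x) \right|^2}{\log(\nu(K_r)\sigma_m + 1) \sum_{k=n+1}^{m} |X_k|^2} \right\} \right\|_{L_1(\mu)} \le C.$$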


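PROOF. Let $\{\epsilon_n\}$ be a Rademacher sequence which is independent of $\{X_n\}$.
Let $E_\epsilon$ and $E$ be the corresponding expectations in the probability spaces of $\{\epsilon_n\}$
and $\{X_n\}$, respectively. For a.e. $\omega \in \Omega$, the sequence $\{X_n(\omega)\epsilon_n\}$ is a sequence of
independent bounded random variables. So, we may apply Theorem 3.5. Hence
there exist positive constants $\varepsilon$ and $C$, which are independent of $\{X_n(\omega)\epsilon_n\}$,
such that for a.e. $\omega \in \Omega$ we have

$$E_\epsilon \left[ \sup_{m \ge 1} \max_{0 \le n < m} \sup_{r \ge 1} \exp\left\{ \varepsilon \cdot \frac{\max_{x \in K_r} \left| \sum_{k=n+1}^{m} X_k(\omega) \epsilon_k f_k(x) \right|^2}{\log(\nu(K_r)\sigma_m + 1) \sum_{k=n+1}^{m} |X_k(\omega)\epsilon_k|^2} \right\} \right] \le C.$$

By taking the expectation $E$ we have

$$E E_\epsilon \left[ \sup_{m \ge 1} \max_{0 \le n < m} \sup_{r \ge 1} \exp\left\{ \varepsilon \cdot \frac{\max_{x \in K_r} \left| \sum_{k=n+1}^{m} X_k \epsilon_k f_k(x) \right|^2}{\log(\nu(K_r)\sigma_m + 1) \sum_{k=n+1}^{m} |X_k \epsilon_k|^2} \right\} \right] \le C.$$

Since $\{X_n\}$ is symmetric and independent, and is also independent of $\{\epsilon_n\}$, the
sequences $\{X_n \epsilon_n\}$ and $\{X_n\}$ are stochastically equivalent. Hence, the assertion of
the theorem follows from the above result for $\{X_n \epsilon_n\}$. □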


For the next results, we will specify the $\{\sigma_n\}$-system.


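COROLLARY 3.7. Let $\{X_n\}$ be symmetric independent complex valued ran-
dom variables on $(\Omega, \mu)$. Let $\lambda_n = (\lambda_n^{(1)}, \ldots, \lambda_n^{(s)})$ be a sequence of vectors in $\mathbb{R}^s$.
Then there exist positive constants $\varepsilon$ and $C$, independent of $\{X_n\}$, such that
(with 0/0 interpreted as 1)

$$\left\| \sup_{m \ge 1} \max_{0 \le n < m} \sup_{T \ge 1} \exp\left\{ \varepsilon \cdot \frac{\max_{t \in [-T,T]^s} \left| \sum_{k=n+1}^{m} X_k e^{i\langle \lambda_k, t \rangle} \right|^2}{\log\left[ C_s T^s m^s \prod_{i=1}^{s} (|\lambda_m^{(i)}|^* + 1) + 1 \right] \sum_{k=n+1}^{m} |X_k|^2} \right\} \right\|_{L_1(\mu)} \le C,$$

where $C_s = (12s)^s$. Hence for a.e. $\omega \in \Omega$ we have

$$\sup_{m > n} \sup_{T \ge 1} \frac{\max_{t \in [-T,T]^s} \left| \sum_{k=n+1}^{m} X_k(\omega) e^{i\langle \lambda_k, t \rangle} \right|^2}{\log\left[ C_s T^s m^s \prod_{i=1}^{s} (|\lambda_m^{(i)}|^* + 1) + 1 \right] \sum_{k=n+1}^{m} |X_k(\omega)|^2} < \infty.$$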

PROOF. For every $T \ge 1$ and $t \in \mathbb{R}^s$, put $K_T = [-T, T]^s$ and $f_n(t) = e^{i\langle \lambda_n, t \rangle}$.
By uniform continuity, the measurable function $\max_{t \in [-T,T]^s} |\sum_{k=n+1}^{m} X_k e^{i\langle \lambda_k, t \rangle}|$
is a continuous function of $T$. Hence, the suprema over $T \ge 1$ can be taken as
suprema over the rational numbers. So, the integrand is measurable.

As noted in Example 2.2, the sequence of functions $f_n(t) = e^{i\langle \lambda_n, t \rangle}$ forms a uni-
form $\{(6s)^s m^s \prod_{i=1}^{s} (|\lambda_m^{(i)}|^* + 1)\}$-system on $\{(K_r, \nu)\}_{r=1}^{\infty}$, where $\nu$ is the Lebesgue
measure. Clearly, $\{1/\nu(K_r)\} \in \ell_2$. With the above settings, Corollary 3.6 yields the
results when the supremum over $T$ is taken over the natural numbers $r \ge 1$. Now, for
any real $T \ge 1$, with $r = [T]$ the integral part of $T$, we have


maxte[—7,T}5 | feng 1 Xelo) Pe? |? 


log(CsTS -ms -TE (ag? + D - 1] Da 1X)? 


mMaxte{—r—1,r+1}° | rans Xk(0)e Ae? |? 


"oS 
log[Cs(r + 5 m5 - TA (aw I* D 40-21 Xe)? 
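As a rough numerical illustration of the normalized quantity controlled in Corollary 3.7, the following sketch treats the case s = 1 and evaluates the ratio on a finite grid of t, so it only gives a lower approximation of the true maximum. The function name `normalized_ratio`, the grid size and the 0-based indexing are assumptions made for this example and are not taken from the paper.

```python
import cmath
import math


def normalized_ratio(x, lam, n, m, T, grid=2000):
    """Grid approximation (case s = 1) of
        max_{|t| <= T} |sum_{k=n+1}^m x_k e^{i lam_k t}|^2
        / ( log(12 T m (|lam_m|^* + 1) + 1) * sum_{k=n+1}^m |x_k|^2 ),
    where |lam_m|^* = max_{k <= m} |lam_k| and C_s = 12 for s = 1.
    The text is 1-indexed; Python lists here are 0-indexed."""
    ks = range(n, m)  # corresponds to k = n+1, ..., m in the text
    energy = sum(abs(x[k]) ** 2 for k in ks)
    if energy == 0.0:
        return 1.0  # convention 0/0 = 1
    lam_star = max(abs(v) for v in lam[:m])
    denom = math.log(12 * T * m * (lam_star + 1) + 1) * energy
    best = 0.0
    for g in range(grid + 1):
        t = -T + 2 * T * g / grid
        s = sum(x[k] * cmath.exp(1j * lam[k] * t) for k in ks)
        best = max(best, abs(s) ** 2)
    return best / denom
```

For instance, `normalized_ratio(x, lam, 0, len(x), math.pi)` with Rademacher signs `x` and integer frequencies `lam` returns a value that, by the Cauchy-Schwarz inequality, can never exceed the number of terms divided by the logarithmic factor.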


REMARKS. 1. Theorem 7 in [37] is a one-dimensional version of Corol-
lary 3.7 in the periodic case. More specifically, take $s = 1$, and $\{\lambda_n\}$ a nondecreas-
ing sequence of integers. Of course, in that case the supremum over $T \ge 1$ is re-
dundant. Recently, Weber proved Theorem 7 without monotonicity, and under the
condition $\lambda_n^* \ge n^{\delta}$, $\delta > 0$ (see also remark 2 after Theorem 3.5; personal com-
munication). Weber's proof is completely different; it is based on the Dudley met-
ric entropy method, and uses the Borell-Sudakov-Tsirelson inequality (see [22],
Lemma 3.1, page 57).

2. By using the reduction principle, one can deduce from Theorem 6 in [16] a 
weak one-dimensional version of Corollary 3.7. It gives usual integrability (i.e., 
without the exponential of the square), without the suprema over T. The proof 
of [16] uses a general Gaussian inequality of [18]. Their results hold for centered 
independent random variables under a quite restrictive condition. 

3. As we will see in Section 5, it seems that in applications to random ergodic
theory the supremum over $T \ge 1$ is important.

The theorem below shows that the general centered case holds modulo some 
slight modifications which have no impact on our applications. 


NOTATION. For any real numbers $a$ and $b$ we put $a \vee b = \max\{a, b\}$.


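THEOREM 3.8. Let $1 < p \le 2$, and let $\{X_n\} \subset L_p(\Omega, \mu)$ be a sequence of
complex-valued centered independent random variables. Let $\lambda_n = (\lambda_n^{(1)}, \ldots, \lambda_n^{(s)})$
be a sequence of vectors in $\mathbb{R}^s$. Then there exist positive constants $\varepsilon$ and $C$,
independent of $\{X_n\}$, such that (with 0/0 interpreted as 1)

$$\left\| \sup_{m \ge 1} \max_{0 \le n < m} \sup_{T \ge 1} \exp\left\{ \varepsilon \cdot \frac{\max_{t \in [-T,T]^s} \left| \sum_{k=n+1}^{m} X_k e^{i\langle \lambda_k, t \rangle} \right|^2}{\log(T^s \gamma_m + 1) \left( \sum_{k=n+1}^{m} |X_k|^p + E|X_k|^p \right)^{2/p}} \right\} \right\|_{L_1(\mu)} \le C,$$

where $\gamma_m = (12s)^s \cdot m^s \cdot \prod_{i=1}^{s} (|\lambda_m^{(i)}|^* + 1)$. In particular, for a.e. $\omega \in \Omega$ the quantity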


| 


x ( log(T + 1) log| m v max (as? [* 41) + 1 
«i «s 


m 
»» X; (we! 0 
k=n-+l1 








sup sup max 
m>nT >| te[—-T,7]}§ 


m l/p, ~1 
(X XD xal | 


kzn-F1 
is finite. 
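PROOF. Let $\{X'_n\} \subset L_p(\Omega', \mu')$ be a sequence of independent copies of $\{X_n\}$.
Clearly the sequence $\{X_n - X'_n\}$ is symmetric independent on $(\Omega \times \Omega', \mu \otimes \mu')$,
and hence Corollary 3.7 applies. So, for $\mu \otimes \mu'$ a.e. $(\omega, \omega') \in \Omega \times \Omega'$, there exists
a positive constant $C(\omega, \omega')$, such that for any $m > n \ge 0$, and any $T \ge 1$, we have

$$(*) \qquad \max_{t \in [-T,T]^s} \left| \sum_{k=n+1}^{m} (X_k(\omega) - X'_k(\omega')) e^{i\langle \lambda_k, t \rangle} \right| \le C(\omega, \omega') \sqrt{\log(T^s \gamma_m + 1)} \left( \sum_{k=n+1}^{m} |X_k(\omega) - X'_k(\omega')|^2 \right)^{1/2}.$$

Furthermore, for some universal constants $\varepsilon > 0$ and $C'$, independent of $\{X_n\}$, we
have

$$\iint \exp\{\varepsilon [C(\omega, \omega')]^2\} \, \mu(d\omega) \, \mu'(d\omega') \le C'.$$

Let $E$ and $E'$ be the corresponding expectations in $\Omega$ and $\Omega'$, respectively. By
taking the expectation $E'$ on the left-hand side of (*), we obtain by Jensen's in-
equality and $E'(X'_k) = 0$, that for a.e. $\omega \in \Omega$ we have

$$E' \left[ \max_{t \in [-T,T]^s} \left| \sum_{k=n+1}^{m} (X_k(\omega) - X'_k) e^{i\langle \lambda_k, t \rangle} \right| \right] \ge \max_{t \in [-T,T]^s} \left| \sum_{k=n+1}^{m} X_k(\omega) e^{i\langle \lambda_k, t \rangle} \right|.$$

By taking the expectation $E'$ on the right-hand side of (*), and using: (i) $\|\cdot\|_{\ell_2} \le
\|\cdot\|_{\ell_p}$; (ii) Hölder's inequality; (iii) $(|a| + |b|)^p \le 2^{p-1}(|a|^p + |b|^p)$, we obtain

$$E' \left[ C(\omega, \cdot) \sqrt{\log(T^s \gamma_m + 1)} \left( \sum_{k=n+1}^{m} |X_k(\omega) - X'_k|^2 \right)^{1/2} \right]$$
$$\le \sqrt{\log(T^s \gamma_m + 1)} \; E' \left[ C(\omega, \cdot) \left( \sum_{k=n+1}^{m} |X_k(\omega) - X'_k|^p \right)^{1/p} \right]$$
$$\le \sqrt{\log(T^s \gamma_m + 1)} \left( E'[C(\omega, \cdot)]^{p/(p-1)} \right)^{(p-1)/p} \left( \sum_{k=n+1}^{m} E'|X_k(\omega) - X'_k|^p \right)^{1/p}$$
$$\le \sqrt{\log(T^s \gamma_m + 1)} \left( E'[C(\omega, \cdot)]^{p/(p-1)} \right)^{(p-1)/p} 2^{(p-1)/p} \left( \sum_{k=n+1}^{m} |X_k(\omega)|^p + E|X_k|^p \right)^{1/p}.$$

By combining that with the previous computation we obtain

$$(**) \qquad \sup_{m > n \ge 0} \sup_{T \ge 1} \exp\left\{ \varepsilon \cdot \max_{t \in [-T,T]^s} \left| \sum_{k=n+1}^{m} X_k(\omega) e^{i\langle \lambda_k, t \rangle} \right|^2 \times \left( 2^{2(p-1)/p} \log(T^s \gamma_m + 1) \left( \sum_{k=n+1}^{m} |X_k(\omega)|^p + E|X_k|^p \right)^{2/p} \right)^{-1} \right\}$$
$$\le \exp\left[ \varepsilon \cdot \left( E'[C(\omega, \cdot)]^{p/(p-1)} \right)^{2(p-1)/p} \right].$$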
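Define the function $\varphi_p(u) = \exp(u^{2(p-1)/p})$, which is convex in the interval
$[K_p, \infty)$, where $K_p := \left( \frac{2-p}{2(p-1)} \right)^{p/(2(p-1))}$. Using Jensen's inequality and using $(a +
b)^{\alpha} \le a^{\alpha} + b^{\alpha}$, for $a, b \ge 0$ and $0 < \alpha \le 1$, we have

$$\exp\left[ \varepsilon \cdot \left( E'[C(\omega, \cdot)]^{p/(p-1)} \right)^{2(p-1)/p} \right]$$
$$= \varphi_p\left( E' |\varepsilon^{1/2} C(\omega, \cdot)|^{p/(p-1)} \right)$$
$$\le \varphi_p\left( E' \left[ |\varepsilon^{1/2} C(\omega, \cdot)|^{p/(p-1)} + K_p \right] \right)$$
$$(***) \qquad \le E' \left[ \varphi_p\left( |\varepsilon^{1/2} C(\omega, \cdot)|^{p/(p-1)} + K_p \right) \right]$$
$$= E' \exp\left\{ \left( |\varepsilon^{1/2} C(\omega, \cdot)|^{p/(p-1)} + K_p \right)^{2(p-1)/p} \right\}$$
$$\le E' \exp\left\{ \varepsilon \cdot [C(\omega, \cdot)]^2 + K_p^{2(p-1)/p} \right\}$$
$$= \exp\left\{ \frac{2-p}{2(p-1)} \right\} \cdot E' \exp\{ \varepsilon \cdot [C(\omega, \cdot)]^2 \}.$$

By taking the expectation $E$ in (***) and using the fact that $E E' \exp\{\varepsilon \cdot
[C(\cdot, \cdot)]^2\} \le C'$, we obtain the first result with $C = \exp\{\frac{2-p}{2(p-1)}\} \cdot C'$ and by chang-
ing the value of $\varepsilon$ to $\varepsilon / 2^{2(p-1)/p}$.
The second result follows from the first assertion using the inequality

$$m^s \prod_{i=1}^{s} (|\lambda_m^{(i)}|^* + 1) \le \left[ m \vee \max_{1 \le i \le s} (|\lambda_m^{(i)}|^* + 1) \right]^{2s}. \qquad \square$$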
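The convexity threshold used in the proof above can be checked numerically. The sketch below differentiates $\varphi_p(u) = \exp(u^{a})$ with $a = 2(p-1)/p$ twice by hand: the second derivative is nonnegative exactly when $a u^{a} \ge 1 - a$, which gives the threshold $((1-a)/a)^{1/a}$. The function names are hypothetical and only serve this illustration.

```python
import math


def convexity_threshold(p):
    """Smallest u >= 0 from which phi_p(u) = exp(u**a), a = 2(p-1)/p,
    is convex, for 1 < p <= 2.  Equals ((2-p)/(2(p-1)))**(p/(2(p-1)))."""
    a = 2 * (p - 1) / p
    return ((1 - a) / a) ** (1 / a)


def phi_second_derivative(u, p):
    """Second derivative of exp(u**a) for u > 0, computed in closed form:
    exp(u^a) * (a^2 u^(2a-2) + a(a-1) u^(a-2))."""
    a = 2 * (p - 1) / p
    return math.exp(u ** a) * (a * a * u ** (2 * a - 2) + a * (a - 1) * u ** (a - 2))
```

With `p = 1.5` the threshold raised to the power `2(p-1)/p` equals `(2-p)/(2(p-1)) = 0.5`, which is the constant appearing in the exponential factor at the end of the chain of inequalities.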

4. Convergence results. In this section we give general convergence results 
for sequences of Banach-valued random variables. Then we apply these results 
with our previous uniform estimates, and we obtain a.e. uniform convergence of 
random series of almost periodic functions. 

Let $B$ be a separable Banach space with norm $\|\cdot\|$. Let $(\Omega, \mu)$ be a measure
space, and let $E$ be the corresponding expectation. Let $X$ be a random variable
on $\Omega$ with values in $B$. The separability assumption of $B$ is made in order to avoid
measurability complications. For $1 < p < \infty$, we denote by $\|X\|_p$ the quantity
$(E\|X\|^p)^{1/p}$. The Banach space of all random variables with finite $\|\cdot\|_p$-norm is
denoted by $L_p(\Omega, \mu; B)$ or simply by $L_p(\mu; B)$ [or $L_p(B)$]. When $B = \mathbb{C}$ (or $\mathbb{R}$)
we just write $L_p(\Omega, \mu)$ or simply $L_p$.

Let $K$ be a compact metric separable space. In our applications we will take
$B = C(K)$, the space of continuous functions on $K$, with the norm
$\|f\| = \max_{x \in K} |f(x)|$ for $f \in C(K)$.


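THEOREM 4.1. Let $\{X_n\} \subset L_p(\Omega, \mu; B)$, with $1 < p < \infty$. Let $\{\alpha_n\}$ be a
sequence of nonnegative numbers. Let $\gamma$ and $C$ be positive constants, and as-
sume there exists a positive nondecreasing (possibly constant) sequence $\{A_n\}$, with
$A_n \le Cn^{\gamma}$, such that for every $m > n \ge 0$ we have

$$(7) \qquad E \left\| \sum_{k=n+1}^{m} X_k \right\|^p \le A_m \sum_{k=n+1}^{m} \alpha_k.$$

If $\sum_{n=1}^{\infty} \alpha_n A_n (\log n)^p$ converges, then the series $\sum_{n=1}^{\infty} X_n$ converges almost every-
where and in $L_p(\Omega, \mu; B)$. Furthermore, we have

$$\left\| \sup_{n \ge 1} \left\| \sum_{k=1}^{n} X_k \right\| \right\|_p \le 2e^{1/p} \left[ 1 + p^{(p-1)/p} (p + \gamma) \right] \left( \sum_{n=1}^{\infty} \alpha_n A_n (\log n)^p \right)^{1/p}.$$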


PROOF. By remark 4 after Proposition 2.6, we have

$$\left\| \max_{n < l \le m} \left\| \sum_{k=n+1}^{l} X_k \right\| \right\|_p^p \le A_m (\log_2 4m)^p \sum_{k=n+1}^{m} \alpha_k,$$

for every $m > n \ge 0$. If we restrict ourselves to $m \ge 2$, by changing $\{A_n\}$, we may
replace the $\log_2 4m$ by the natural logarithm $\log m$.

We define a sequence of integers $\{\kappa_n\}$ as follows. Let $\kappa_1$ be the first integer
for which $A_{\kappa_1} (\log \kappa_1)^p \ge e$. For every $n \ge 1$, we define inductively

$$\kappa_{n+1} = \max\{ m \ge \kappa_n + 1 : A_m (\log m)^p \le e A_{\kappa_n + 1} (\log(\kappa_n + 1))^p \}.$$

Clearly, the sequence $\{\kappa_n\}$ is strictly increasing, and for every $n \ge 1$ we have the
following properties:

(i) $A_{\kappa_{n+1}} (\log \kappa_{n+1})^p \le e A_{\kappa_n + 1} (\log(\kappa_n + 1))^p < A_{\kappa_{n+1} + 1} (\log(\kappa_{n+1} + 1))^p$; us-
ing (i) we have
(ii) $A_{\kappa_{n+1}} \le e A_{\kappa_n + 1}$ and $A_{\kappa_n + 1} (\log(\kappa_n + 1))^p \ge e^n$; by the assumption
$A_n \le Cn^{\gamma}$ and (ii), we have
(iii) $(p + \gamma) \log(\kappa_n + 1) \ge n$.
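The block construction above is algorithmic, and a small sketch may help to see how the indices $\kappa_n$ are produced. The code below assumes the sequence is given as a Python function `A` returning the (positive, nondecreasing) value $A_n$, and it only searches up to a user-chosen `horizon`; both the function name `kappa_blocks` and the horizon cut-off are assumptions of this illustration, not part of the paper.

```python
import math


def kappa_blocks(A, p, horizon):
    """Return [kappa_1, kappa_2, ...] for the weights w(n) = A(n) * (log n)**p.

    kappa_1 is the first n with w(n) >= e, and kappa_{n+1} is the largest m
    with w(m) <= e * w(kappa_n + 1).  A block is reported only if its end is
    certified inside the horizon, i.e. w(kappa + 1) exceeds the bound."""
    def w(n):
        return A(n) * math.log(n) ** p

    k = 1
    while k <= horizon and w(k) < math.e:
        k += 1
    if k > horizon:
        return []
    kappas = [k]
    while True:
        start = kappas[-1] + 1
        bound = math.e * w(start)
        m = start
        # w is nondecreasing, so scan forward while the bound still holds
        while m < horizon and w(m + 1) <= bound:
            m += 1
        if m >= horizon:
            break  # the block is not closed within the horizon
        kappas.append(m)
    return kappas
```

For a constant sequence $A_n = 1$ and $p = 1$ the blocks grow very fast, which reflects property (iii): $\log \kappa_n$ grows at least linearly in $n$.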




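(a) Using (7), (ii) and (iii), we obtain

$$\int \sum_{\nu=1}^{\infty} \nu^p \left\| \sum_{k=\kappa_\nu+1}^{\kappa_{\nu+1}} X_k \right\|^p d\mu \le \sum_{\nu=1}^{\infty} \nu^p A_{\kappa_{\nu+1}} \sum_{k=\kappa_\nu+1}^{\kappa_{\nu+1}} \alpha_k \le e (p + \gamma)^p \sum_{n=1}^{\infty} \alpha_n A_n (\log n)^p < \infty.$$

Hence by Beppo Levi the integrand $\sum_{\nu=1}^{\infty} \nu^p \| \sum_{k=\kappa_\nu+1}^{\kappa_{\nu+1}} X_k \|^p$ converges almost
everywhere.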
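(b) For any naturals $r$ and $m$ we obtain, using Hölder's inequality,

$$\left\| \sum_{k=\kappa_m+1}^{\kappa_{m+r}} X_k \right\|^p \le \left( \sum_{\nu=m}^{m+r-1} \frac{1}{\nu} \cdot \nu \left\| \sum_{k=\kappa_\nu+1}^{\kappa_{\nu+1}} X_k \right\| \right)^p \le \left( \sum_{\nu=m}^{\infty} \frac{1}{\nu^{p/(p-1)}} \right)^{p-1} \sum_{\nu=m}^{\infty} \nu^p \left\| \sum_{k=\kappa_\nu+1}^{\kappa_{\nu+1}} X_k \right\|^p.$$

The first factor on the right-hand side converges to zero as $m \to \infty$, while the last
factor converges a.e. by (a), so $\{\sum_{k=1}^{\kappa_n} X_k\}$ is a Cauchy sequence a.e., and hence
converges a.e. By taking integrals of the above inequality, and considering the con-
vergence proved in (a), $\{\sum_{k=1}^{\kappa_n} X_k\}$ is a Cauchy sequence in $L_p(\Omega, \mu; B)$-norm,
and hence converges in norm.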

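(c) Using (7) and (i), we have

$$\sum_{m=1}^{\infty} \int \max_{\kappa_m < n \le \kappa_{m+1}} \left\| \sum_{k=\kappa_m+1}^{n} X_k \right\|^p d\mu \le \sum_{m=1}^{\infty} A_{\kappa_{m+1}} (\log \kappa_{m+1})^p \sum_{k=\kappa_m+1}^{\kappa_{m+1}} \alpha_k \le e \sum_{n=1}^{\infty} \alpha_n A_n (\log n)^p < \infty.$$

Now, (b) and (c) imply that $\sum_{n=1}^{\infty} X_n$ converges a.e. in $B$ to $X :=
\lim_{m \to \infty} \sum_{n=1}^{\kappa_m} X_n$, since for $\kappa_m < n \le \kappa_{m+1}$, we have

$$\left\| \sum_{k=1}^{n} X_k - X \right\| \le \left\| \sum_{k=1}^{\kappa_m} X_k - X \right\| + \left\| \sum_{k=\kappa_m+1}^{n} X_k \right\| \le \left\| \sum_{k=1}^{\kappa_m} X_k - X \right\| + \max_{\kappa_m < n \le \kappa_{m+1}} \left\| \sum_{k=\kappa_m+1}^{n} X_k \right\|.$$

By considering the norm convergence proved in (a) and (b), the
$L_p(\Omega, \mu; B)$-norm convergence follows by taking the $L_p(\Omega, \mu)$-norm in the
above inequality.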




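Now we will prove that $\sup_{n \ge 1} \| \sum_{k=1}^{n} X_k \| \in L_p(\mu)$. The inequality in (b) with
$m = 1$ yields

$$\sup_{r \ge 1} \left\| \sum_{k=\kappa_1+1}^{\kappa_{r+1}} X_k \right\|^p \le \left( \sum_{\nu=1}^{\infty} \frac{1}{\nu^{p/(p-1)}} \right)^{p-1} \sum_{\nu=1}^{\infty} \nu^p \left\| \sum_{k=\kappa_\nu+1}^{\kappa_{\nu+1}} X_k \right\|^p.$$

Integration of the above inequality and application of (a) yield

$$(*) \qquad \int \sup_{r \ge 1} \left\| \sum_{k=\kappa_1+1}^{\kappa_{r+1}} X_k \right\|^p d\mu \le p^{p-1} e (p + \gamma)^p \sum_{n=1}^{\infty} \alpha_n A_n (\log n)^p < \infty.$$

The inequality in (c) yields

$$(**) \qquad \int \sup_{m \ge 1} \max_{\kappa_m < n \le \kappa_{m+1}} \left\| \sum_{k=\kappa_m+1}^{n} X_k \right\|^p d\mu \le \sum_{m=1}^{\infty} \int \max_{\kappa_m < n \le \kappa_{m+1}} \left\| \sum_{k=\kappa_m+1}^{n} X_k \right\|^p d\mu \le e \sum_{n=1}^{\infty} \alpha_n A_n (\log n)^p < \infty.$$

Using (*) and (**), we have

$$\left( \int \sup_{n > \kappa_1} \left\| \sum_{k=\kappa_1+1}^{n} X_k \right\|^p d\mu \right)^{1/p} \le e^{1/p} \left[ 1 + p^{(p-1)/p} (p + \gamma) \right] \left( \sum_{n=1}^{\infty} \alpha_n A_n (\log n)^p \right)^{1/p}.$$

Since a.e. we have

$$\sup_{n \ge 1} \left\| \sum_{k=1}^{n} X_k \right\| \le \|X_1\| + \|X_2\| + \cdots + \|X_{\kappa_1}\| + \sup_{n > \kappa_1} \left\| \sum_{k=\kappa_1+1}^{n} X_k \right\|,$$

we have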


Y 


k=l 














p 1/p 
(/ sup in) 
n>] 


€ Ag (01 +--+ + o) 


oo I/p 
+ eVP[1 + pP-Y/P(p + y)] p anAn ogn) ) , 


n=l] 


So, we obtain the maximal inequality. □




REMARKS. 1. Theorem 4.1 uses techniques of Theorem 2.4 in [8]. In [8] the
theorem was obtained without the condition $A_n \le Cn^{\gamma}$ (which is not restrictive in
our applications), but under the assumption $\sum_{n \ge 1} A_{2n} \alpha_n (\log n)^p < \infty$.

2. If (7) holds for $\{A_m\}$ bounded, then the condition $\sum_{n=1}^{\infty} \alpha_n < \infty$ is not suf-
ficient in general for the a.e. convergence (see Menchoff's example [25], Theo-
rem 3).


THEOREM 4.2. Let $\{X_n\} \subset L_p(\Omega, \mu; B)$, with $1 < p < \infty$. Let $\{\alpha_n\}$ be a se-
quence of nonnegative numbers. Assume there exists a positive nondecreasing un-
bounded sequence $\{A_n\}$, with $A_1 \ge 1$, and some positive constant $q \ge 1$, such that
for every $m > n \ge 0$ we have
P m q 
k=n+1 : 


y in| X a) 


n=] (k: 2^ «A, x20] 








i 
Y x 


k=n+1 


(8) | e| max 


n«lzm 








If the series 


or in particular 


ta (2 2Àk >n} a yt P 


2 n(logn)!-VP 


n2 


converges, then the series $\sum_{n=1}^{\infty} X_n$ converges almost everywhere and
in $L_p(\Omega, \mu; B)$. Furthermore, we have


n oo s/p 
Y: Xx «aye Y a f 
k=1 } 


p nal (k: A22" 


























sup 
n>] 


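PROOF. Define two sequences $\{\kappa_n\}$ and $\{l_n\}$ by induction. Let $\kappa_1 = l_1 = 0$.
Let $l_{n+1}$ be the integer such that $2^{l_{n+1}} \le A_{\kappa_n + 1} < 2^{l_{n+1}+1}$. Then define $\kappa_{n+1} =
\max\{ m \ge \kappa_n + 1 : 2^{l_{n+1}} \le A_m < 2^{l_{n+1}+1} \}$. Clearly, $\kappa_n$ is finite by the unboundedness
and nondecreasingness of $\{A_n\}$; moreover, it is increasing. In particular, by as-
sumption on $A_1$, we have $l_2 \ge 0$. Now, for every $n \ge 1$ we have

$$\left( \int \max_{\kappa_n < m \le \kappa_{n+1}} \left\| \sum_{k=\kappa_n+1}^{m} X_k \right\|^p d\mu \right)^{1/p} \le A_{\kappa_{n+1}}^{1/p} \left( \sum_{k=\kappa_n+1}^{\kappa_{n+1}} \alpha_k \right)^{q/p} \le 2^{(l_{n+1}+1)/p} \left( \sum_{\{k :\, 2^{l_{n+1}} \le A_k < 2^{l_{n+1}+1}\}} \alpha_k \right)^{q/p}.$$

Hence,

$$\sum_{n=1}^{\infty} \left\| \max_{\kappa_n < l \le \kappa_{n+1}} \left\| \sum_{k=\kappa_n+1}^{l} X_k \right\| \right\|_p \le 2^{1/p} \sum_{n=0}^{\infty} 2^{n/p} \left( \sum_{\{k :\, 2^{n} \le A_k < 2^{n+1}\}} \alpha_k \right)^{q/p}.$$

This proves the a.e. convergence.
Let $\kappa_n < m \le \kappa_{n+1}$. We have

$$\left\| \sum_{k=1}^{m} X_k \right\| \le \sum_{l=1}^{n-1} \left\| \sum_{k=\kappa_l+1}^{\kappa_{l+1}} X_k \right\| + \max_{\kappa_n + 1 \le l \le \kappa_{n+1}} \left\| \sum_{k=\kappa_n+1}^{l} X_k \right\|.$$

The maximal inequality then follows from the previous computations. □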
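The dyadic blocking used in this proof groups the indices $k$ according to the integer part of $\log_2 A_k$. The sketch below works on a finite prefix of the sequences (so the last block may be incomplete compared with the infinite sequence) and evaluates the corresponding partial sum $\sum_n 2^{n/p} (\sum_{2^n \le A_k < 2^{n+1}} \alpha_k)^{q/p}$. The names `dyadic_blocks` and `block_series` are hypothetical and chosen for this example only.

```python
import math


def dyadic_blocks(A):
    """Split indices 1..len(A) into maximal runs on which
    2**level <= A_k < 2**(level + 1).

    A is a nondecreasing list with A[0] = A_1 >= 1.  Returns a list of
    triples (level, first_index, last_index), using 1-based indices."""
    blocks = []
    kappa = 0
    while kappa < len(A):
        level = math.floor(math.log2(A[kappa]))  # level of A_{kappa + 1}
        last = kappa + 1
        while last < len(A) and A[last] < 2 ** (level + 1):
            last += 1
        blocks.append((level, kappa + 1, last))
        kappa = last
    return blocks


def block_series(A, alpha, p, q):
    """Partial sum of sum_n 2**(n/p) * (sum over the block of alpha_k)**(q/p)."""
    total = 0.0
    for level, first, last in dyadic_blocks(A):
        mass = sum(alpha[first - 1:last])
        total += 2 ** (level / p) * mass ** (q / p)
    return total
```

Because $\{A_n\}$ is nondecreasing, each dyadic level corresponds to one contiguous run of indices, which is why a single forward scan suffices.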


REMARKS. 1. For $q > 1$, using remarks 1 and 2 after Proposition 2.6, we
could assume the following weaker condition instead of (8):

$$(9) \qquad \left\| \sum_{k=n+1}^{m} X_k \right\|_p^p \le A_m \left( \sum_{k=n+1}^{m} \alpha_k \right)^q \qquad \text{for every } m > n \ge 0.$$

2. Clearly, u (sup ;>1 || yx -a41 Skil = £) - Qis equivalent to the u-a.e. conver- 


gence of ?7* , Xn. Hence, if for some 1 < p < oo and 1 <q < oo, (8) holds for 
every m > m > 0 with {A,} bounded, then the condition ?7? , a, < co is sufficient 
for the convergence result of the above theorem. Moreover, we have 


Eme E-] 


k=l k=1 














(10) e| snl 3 

nzl 

3. Using remarks 1 and 2 above, the condition ) 7? , an < co is sufficient for the 

convergence result of the above theorem under (9), with g > 1 and {A,,} bounded. 

In that case (10) holds by multiplying the right-hand side by factor (1 — xn ? 
(see also [8], Theorem 2.5). 


4.1. Convergence of random almost periodic series. Let $\{X_n\} \subset L_\infty(\Omega, \mu)$
be a sequence of complex centered random variables. In the following we will use
the notation:


i 
= ||XE(X; |F) llo for every i > 1. 
k=] 


THEOREM 4.3. Let $\{X_n\} \subset L_\infty(\Omega, \mu)$ be a sequence of complex centered
random variables, and let $2 \le p < \infty$. Let $\{f_n\}$ be a $\{\sigma_n\}$-system on $(K, \nu)$. If the

series 
1/2 
Y (DL ik: a doge PP 5 9) Ok) / 
n(logn)!-» 


n=2 




converges, then for a.e. $\omega \in \Omega$ the series $\sum_{k=1}^{\infty} X_k(\omega) f_k$ converges
uniformly on $K$. Furthermore, it converges in $L_p(\Omega, \mu; C(K))$, and
$\sup_{n \ge 1} \max_{x \in K} | \sum_{k=1}^{n} X_k f_k(x) |$ is in $L_p(\mu)$.


PROOF. Put $B = C(K)$. Using the second part of Theorem 3.1, up to an appro-
priate constant, we obtain (8) with $p \ge 2$, $q = p/2 \ge 1$, $A_n = C_p (\log(\sigma_n + 1))^{p/2}$
and $\{\alpha_i\}$ as defined above. Theorem 4.2 yields the results. □


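THEOREM 4.4. Let $\{X_n\} \subset L_\infty(\Omega, \mu)$ be a martingale difference sequence.
Let $\lambda_n = (\lambda_n^{(1)}, \ldots, \lambda_n^{(s)})$ be a sequence in $\mathbb{R}^s$, and put $\gamma_k = k \vee \max_{1 \le j \le s} |\lambda_k^{(j)}|^*$.
If the series

$$\sum_{n=2}^{\infty} \frac{\left( \sum_{\{k :\, \gamma_k \ge n\}} \|X_k\|_\infty^2 \right)^{1/2}}{n (\log n)^{1/2}}$$

converges, then for every $T \ge 1$ and for almost every $\omega \in \Omega$, the series
$\sum_{k=1}^{\infty} X_k(\omega) e^{i\langle \lambda_k, t \rangle}$ converges uniformly in $t \in [-T, T]^s$. Furthermore, it con-
verges in $L_2(\Omega, \mu; C([-T, T]^s))$, and

$$\sup_{n \ge 1} \max_{t \in [-T,T]^s} \left| \sum_{k=1}^{n} X_k e^{i\langle \lambda_k, t \rangle} \right|$$

is in $L_2(\mu)$.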


PROOF. Let $K = [-T, T]^s$, and let $\nu$ be the Lebesgue measure on $\mathbb{R}^s$. By
Example 2.1 the sequence $\{e^{i\langle \lambda_n, t \rangle}\}$ forms a $\{(n \vee \max_{1 \le j \le s} |\lambda_n^{(j)}|^*)^{2s}\}$-system
on $(K, \nu)$. With the above settings, Theorem 4.3 with $p = 2$ yields the assertions
of the theorem if


> (Stk: yn!) XK SG) 7 
n(log n)1/2 


n=2 


By a change of variable in the series above we obtain the condition of the theo-
rem. □


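THEOREM 4.5. Let $\{X_n\} \subset L_2(\Omega, \mu)$ be a sequence of centered independent
random variables. Let $\lambda_n = (\lambda_n^{(1)}, \ldots, \lambda_n^{(s)})$ be a sequence of vectors in $\mathbb{R}^s$, and
put $\gamma_k = k \vee \max_{1 \le j \le s} |\lambda_k^{(j)}|^*$. If the series

$$(11) \qquad \sum_{n=2}^{\infty} \frac{\left( \sum_{\{k :\, \gamma_k \ge n\}} \|X_k\|_2^2 \right)^{1/2}}{n (\log n)^{1/2}}$$

converges, then for every $T \ge 1$ and for almost every $\omega \in \Omega$, the series
$\sum_{k=1}^{\infty} X_k(\omega) e^{i\langle \lambda_k, t \rangle}$ converges uniformly in $t \in [-T, T]^s$. Furthermore, it con-
verges in $L_2(\Omega, \mu; C([-T, T]^s))$ and

$$\sup_{n \ge 1} \max_{t \in [-T,T]^s} \left| \sum_{k=1}^{n} X_k e^{i\langle \lambda_k, t \rangle} \right|$$

is in $L_2(\mu)$.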


PROOF. By Example 2.1 the sequence $\{e^{i\langle \lambda_n, t \rangle}\}$ forms a $\{(n \vee \max_{1 \le j \le s}
|\lambda_n^{(j)}|^*)^{2s}\}$-system on $([-T, T]^s, \nu)$. Using Corollary 3.3 we obtain inequality (8)
with $p = 2$, $q = 1$ and $A_n = 2sC_p \log \gamma_n$. Theorem 4.2 yields the result, as in
Theorem 4.4. □


REMARKS. 1. The a.e. uniform convergence of multidimensional almost peri-
odic series with square integrable symmetric independent coefficients was consid-
ered in Marcus and Pisier [23], Chapter VII, Section 1. Their proofs use the metric
entropy method. Using the metric entropy method one can obtain a more precise
condition than (11). Our condition is the same sufficient condition as is implicit
in [23], Chapter VII, Lemma 1.1. In any case, Theorem 4.5 completely recovers
Theorem 5.1.5 of [34].

2. In [24] the a.e. uniform convergence of multidimensional almost periodic 
series with i.i.d. p-stable coefficients was considered. As it is implicit in [24], Re- 
mark 4.4 (use [23], Chapter VII, Lemma 1.1 and then [24], Theorem B) the result 
is valid for symmetric i.i.d. coefficients [not necessarily p-stable or in L5 (u)]. 

3. Extending the results beyond the scope of symmetric random variables is pos-
sible, under various conditions, by a symmetrization procedure (see, e.g., [5]).

4. Let $1 < p < 2$, and let $\{X_n\} \subset L_p(\mu)$ be centered independent random vari-
ables. We could use our previous results in order to obtain a.e. uniform conver-
gence also in this case (i.e., for $p < 2$). In that case our derived sufficient condition
is not as good as the one given in [24], Remark 4.4.


5. Applications to ergodic theory. Let $(Y, \Sigma, \pi)$ be a probability space,
and let $V_1, \ldots, V_s$ be pairwise commuting isometries on $L_2(Y, \pi)$. For every
$j = (j^{(1)}, \ldots, j^{(s)}) \in \mathbb{N}^s$ and $f \in \mathcal{H} := L_2(Y, \pi)$, define the action
$V^j f = V_1^{j^{(1)}} \cdots V_s^{j^{(s)}} f$. Using the dilation theorem for pairwise commut-
ing family of isometries (see, e.g., [28], Chapter I, Proposition 6.2), there exist
a Hilbert space $\mathcal{H}' \supset \mathcal{H}$ and a family $U_1, \ldots, U_s$ of pairwise commuting unitary
operators on $\mathcal{H}'$, such that for every $j \in \mathbb{N}^s$ and $f \in \mathcal{H}$ we have $V^j f = P_{\mathcal{H}} U^j f$,
where $P_{\mathcal{H}}$ is the orthogonal projection of $\mathcal{H}'$ onto $\mathcal{H}$ and $U^j f = U_1^{j^{(1)}} \circ \cdots
\circ U_s^{j^{(s)}} f$. By the spectral theorem for the unitary representation $\{U^k : k \in
\mathbb{Z}^s\}$ (e.g., see [31], Chapter X, Section 140), there is a positive measure $\mu_f$
on $[-\pi, \pi)^s$, called the spectral measure of $f$, such that for any $k \in \mathbb{Z}^s$ we have

$$\langle U^k f, f \rangle = \int e^{i\langle k, t \rangle} \, d\mu_f(t),$$

where $\langle k, t \rangle$ denotes the inner product in $\mathbb{R}^s$.
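Let $\{a_k\}$ be a sequence of complex numbers, and let $\{j_k\} \subset \mathbb{N}^s$. Using the dila-
tion theorem and the spectral theorem, for every $n$ we have

$$(12) \qquad \left\| \sum_{k=1}^{n} a_k V^{j_k} f \right\|_2 = \left\| P_{\mathcal{H}} \sum_{k=1}^{n} a_k U^{j_k} f \right\| \le \|f\|_2 \max_{t \in [-\pi,\pi)^s} \left| \sum_{k=1}^{n} a_k e^{i\langle j_k, t \rangle} \right|.$$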
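Inequality (12) can be tested in a finite-dimensional toy model. The sketch below takes $s = 1$ and a diagonal unitary matrix $U = \mathrm{diag}(e^{i\theta_d})$, for which the spectral measure of a vector $f$ is $\sum_d |f_d|^2 \delta_{\theta_d}$, and compares both sides. It is only an illustration under these assumptions: the function names are hypothetical, and the maximum is taken over a finite grid, so the comparison is exact only when the angles $\theta_d$ are themselves grid points.

```python
import cmath
import math


def trig_poly(a, j, t):
    """P(t) = sum_k a_k e^{i j_k t} (the case s = 1)."""
    return sum(ak * cmath.exp(1j * jk * t) for ak, jk in zip(a, j))


def lhs_norm(a, j, thetas, f):
    """|| sum_k a_k U^{j_k} f ||_2 for U = diag(e^{i theta_d}).

    U^j multiplies coordinate d by e^{i j theta_d}, so the whole sum
    multiplies coordinate d by P(theta_d)."""
    return math.sqrt(sum(abs(trig_poly(a, j, th) * fd) ** 2
                         for th, fd in zip(thetas, f)))


def rhs_bound(a, j, f, grid):
    """||f||_2 times the maximum of |P| over the grid points
    -pi + 2*pi*g/grid, g = 0, ..., grid - 1."""
    norm_f = math.sqrt(sum(abs(fd) ** 2 for fd in f))
    sup = max(abs(trig_poly(a, j, -math.pi + 2 * math.pi * g / grid))
              for g in range(grid))
    return norm_f * sup
```

In this model the left-hand side is a weighted $\ell_2$ average of $|P(\theta_d)|$, which makes the bound by the maximum of $|P|$ transparent.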


REMARKS. 1. Let $T_1$ and $T_2$ be commuting contractions on $\mathcal{H}$. One can
consider Ando's unitary dilation for pairs of contractions ([28], Chapter I, The-
orem 6.4). Specifically, there exist two commuting unitary operators $U_1$ and $U_2$,
acting on $\mathcal{H}' \supset \mathcal{H}$, such that for all natural numbers $n$ and $m$ we have $T_1^n T_2^m f =
P_{\mathcal{H}} U_1^n U_2^m f$. In the case of two commuting contractions, (12) is still true when
$V_j$ are replaced by $T_j$.

2. Parrott [30] gave an example (see also [28], Chapter I, Section 3) which 
showed that the dilation theorem is no longer true in the case of more than 
two commuting contractions (for an analogue for commuting Markov operators 
see [15]). 

3. For more than two commuting contractions, additional conditions should be 
added in order to obtain a regular dilation (see [28], Chapter I, Theorem 9.1 or a 
theorem of Brehmer [27], Chapter 6). A simple condition that one can assume on 
a family of contractions is the pairwise doubly permutability (where doubly stands 
for commuting also with the conjugate); see [28], Chapter I, Proposition 9.2. 


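NOTATION. For $j = (j^{(1)}, \ldots, j^{(s)}) \in \mathbb{Z}^s$, put $|j| = \max\{|j^{(1)}|, \ldots, |j^{(s)}|\}$. We
recall our notation $a \vee b = \max\{a, b\}$, and $c_m^* = \max_{1 \le n \le m} c_n$ for a positive se-
quence $\{c_n\}$. If $j_n = (j_n^{(1)}, \ldots, j_n^{(s)})$ is a sequence of vectors, then by our notation
we have

$$|j_m|^* = \max_{1 \le n \le m} |j_n| = \max_{1 \le n \le m} \max_{1 \le i \le s} |j_n^{(i)}|.$$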
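The starred notation is simply a running maximum, which the following sketch makes concrete. The helper names `running_max`, `sup_norm` and `j_star` are assumptions introduced for this example.

```python
from itertools import accumulate


def running_max(c):
    """c*_m = max_{1 <= n <= m} c_n, returned for every m."""
    return list(accumulate(c, max))


def sup_norm(j):
    """|j| = max_i |j^(i)| for an integer vector j."""
    return max(abs(x) for x in j)


def j_star(js):
    """|j_m|^* = max_{1 <= n <= m} |j_n| for a sequence of vectors js."""
    return running_max([sup_norm(j) for j in js])
```

For example, `j_star([(1, -3), (2, 2), (0, 5)])` gives the nondecreasing sequence of running sup-norms of the three vectors.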


THEOREM 5.1. Let $\{X_n\} \subset L_2(\Omega, \mu)$ be a sequence of centered indepen-
dent random variables, and let $\{j_n\} \subset \mathbb{N}^s$. If the series $\sum_{n=1}^{\infty} \|X_n\|_2^2 \log(n \vee
|j_n|^*) (\log n)^2$ converges, then there exists a set of full measure $\Omega^* \subset \Omega$, such that
for every $\omega \in \Omega^*$, for every commuting family of isometries $V_1, \ldots, V_s$ on a space
$L_2(Y, \pi)$ and any $f \in L_2(\pi)$, the series

$$\sum_{n=1}^{\infty} X_n(\omega) V^{j_n} f \quad \text{converges } \pi\text{-a.e.}$$

Furthermore, for every $\omega \in \Omega^*$ there exists a constant $K_\omega < \infty$, independent of $f$,
such that

$$\left\| \sup_{n \ge 1} \left| \sum_{k=1}^{n} X_k(\omega) V^{j_k} f \right| \right\|_2 \le K_\omega \|f\|_2.$$


PROOF. We construct from $\{X_n\}$ two sequences of random variables in the
following way:

$$Y_n = \begin{cases} X_n, & \text{if } |j_n|^* \le e^n, \\ 0, & \text{otherwise,} \end{cases}$$

and

$$Z_n = \begin{cases} X_n, & \text{if } |j_n|^* > e^n, \\ 0, & \text{otherwise.} \end{cases}$$

Hence $X_n = Y_n + Z_n$.
Let $\{n_k\}$ be the sequence of integers for which $|j_{n_k}|^* > e^{n_k}$, that is, the sequence
for which $Z_{n_k}$ is not null. Notice that for every $\omega \in \Omega$, we have

$$(13) \qquad \sum_{n=1}^{\infty} |Z_n(\omega)| \le \left( \sum_{k=1}^{\infty} |X_{n_k}(\omega)|^2 \log(n_k \vee |j_{n_k}|^*) (\log n_k)^2 \right)^{1/2} \times \left( \sum_{k=1}^{\infty} \frac{1}{\log(n_k \vee |j_{n_k}|^*) (\log n_k)^2} \right)^{1/2}$$
$$\le \left( \sum_{n=2}^{\infty} |X_n(\omega)|^2 \log(n \vee |j_n|^*) (\log n)^2 \right)^{1/2} \times \left( \sum_{n=2}^{\infty} \frac{1}{n (\log n)^2} \right)^{1/2}.$$

By our assumption and the theorem of Beppo Levi, for $\mu$-a.e. $\omega \in \Omega$ we have

$$(*) \qquad \sum_{n=1}^{\infty} (|X_n(\omega)|^2 + \|X_n\|_2^2) \log(n \vee |j_n|^*) (\log n)^2 < \infty.$$

Hence, by (13) and (*), it remains only to consider the series $\sum_{n=1}^{\infty} Y_n V^{j_n} f$.

Since $Y_n$ is null when $|j_n|^* > e^n$, we may and do assume from now on (modi-
fying $\{j_n\}$ when necessary) that for every $n \ge 1$, we have $|j_n|^* \le e^n$.

On the other hand, by the second assertion of Theorem 3.8, for $\mu$-a.e. $\omega \in \Omega$
there exists a constant $C_\omega$, such that for every $m > n \ge 1$,

$$(**) \qquad \sup_{t \in [-\pi,\pi)^s} \left| \sum_{k=n+1}^{m} Y_k(\omega) e^{i\langle j_k, t \rangle} \right|^2 \le C_\omega \log(m \vee |j_m|^* + 1) \sum_{k=n+1}^{m} (|Y_k(\omega)|^2 + \|Y_k\|_2^2).$$

Let $\Omega^*$ be the set of $\omega$ for which (*) and (**) hold, and fix $\omega \in \Omega^*$.

Using (**) together with the spectral theorem [see (12)], we have for every $m >
n \ge 0$,

$$\left\| \sum_{k=n+1}^{m} Y_k(\omega) V^{j_k} f \right\|_2^2 \le \|f\|_2^2 \, C_\omega \log(m \vee |j_m|^* + 1) \sum_{k=n+1}^{m} (|Y_k(\omega)|^2 + \|Y_k\|_2^2).$$

Hence, the condition in (7) is satisfied with $\alpha_n = |Y_n(\omega)|^2 + \|Y_n\|_2^2$ and $A_n =
\|f\|_2^2 \, C_\omega \log(n \vee |j_n|^* + 1)$. Since $|j_n|^* \le e^n$ we are allowed to use Theorem 4.1.
Using (*), Theorem 4.1 yields the two assertions of the theorem for the se-
quence $\{Y_n\}$. □
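The truncation step at the start of this proof is easy to mirror in code. The sketch below splits a finite list of coefficients into the part kept in $Y$ (where $|j_n|^* \le e^n$) and the part moved to $Z$; the comparison is done on logarithms to avoid overflow of $e^n$. The function name `split_coefficients` and the list-based interface are assumptions of this illustration.

```python
import math


def split_coefficients(x, js):
    """Return (y, z) with y_n = x_n if |j_n|^* <= e^n and 0 otherwise,
    and z_n = x_n - y_n.  Here n is 1-based and |j_n|^* is the running
    maximum of the sup-norms |j_1|, ..., |j_n|."""
    y, z = [], []
    star = 0
    for n, (xn, jn) in enumerate(zip(x, js), start=1):
        star = max(star, max(abs(c) for c in jn))
        small = star == 0 or math.log(star) <= n
        y.append(xn if small else 0)
        z.append(0 if small else xn)
    return y, z
```

Note that, because $|j_n|^*$ is a running maximum, one very large frequency also sends the following terms to $Z$ until $e^n$ catches up with it.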


THEOREM 5.2. Let $\{X_n\} \subset L_2(\Omega, \mu)$ be a sequence of centered independent
random variables, and let $\{p_n\}$ and $\{q_n\}$ be sequences of natural numbers. If the
series

$$\sum_{n=1}^{\infty} \|X_n\|_2^2 \log(n \vee p_n^* \vee q_n^*) (\log n)^2$$

converges, then there exists a set of full measure $\Omega^* \subset \Omega$, such that for every
$\omega \in \Omega^*$, for every pair of commuting contractions $T_1$ and $T_2$ on a space $L_2(Y, \pi)$ and
any $f \in L_2(\pi)$, the series

$$\sum_{n=1}^{\infty} X_n(\omega) T_1^{p_n} T_2^{q_n} f \quad \text{converges } \pi\text{-a.e.}$$

Furthermore, for every $\omega \in \Omega^*$ there exists a constant $K_\omega < \infty$, independent of $f$,
such that

$$\left\| \sup_{n \ge 1} \left| \sum_{k=1}^{n} X_k(\omega) T_1^{p_k} T_2^{q_k} f \right| \right\|_2 \le K_\omega \|f\|_2.$$

PROOF. As we mentioned in the remarks at the beginning of the section, we
can use the unitary dilation in the case of two commuting contractions. In this case
also (12) is still true. We proceed as in Theorem 5.1. □




REMARKS. 1. Theorem 5.2 extends the part of [8], Theorem 4.2, related to 
square integrable {X,,}, to the case of two commuting contractions. Moreover, the 
convergence is along certain subsequences. 

2. Using Theorems 3.8 and 4.1, we could consider the case $\{X_n\} \subset L_p(\Omega, \mu)$,
$1 < p < 2$. It turns out that Theorem 4.1 does not lead to the best result that one
can obtain. In addition to Theorem 3.8 another tool seems to be needed. This will be
done in the forthcoming paper [6].


Let $\tau_1, \ldots, \tau_s$ be pairwise commuting measure-preserving transformations of a
probability space $(Y, \Sigma, \pi)$. For any $j \in \mathbb{N}^s$ and every $\Sigma$-measurable $f$, we define
$T^j f = f \circ \tau_1^{j^{(1)}} \circ \cdots \circ \tau_s^{j^{(s)}}$. By assumptions, for any $j \in \mathbb{N}^s$ the operator $T^j$ is an
isometry of $L_q(Y, \pi)$ for any $1 \le q \le \infty$. Moreover, for fixed $1 \le q \le \infty$ the ac-
tion $\{T^j : j \in \mathbb{N}^s\}$ on $L_q(Y, \pi)$ is an isometric representation of the semigroup $\mathbb{N}^s$.

As in [33] (and [7]) we want to use Stein's complex interpolation in order to
obtain a.e. convergence results, under the assumptions of Theorem 5.1, also for
$f \in L_q(Y, \pi)$, $1 < q < 2$.


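THEOREM 5.3. Let $\{X_n\} \subset L_2(\Omega, \mu)$ be a sequence of centered indepen-
dent random variables, and let $\{j_n\} \subset \mathbb{N}^s$. If the series $\sum_{n=1}^{\infty} \|X_n\|_2^2 \log(n \vee
|j_n|^*) (\log n)^2$ converges, then there exists a set of full measure $\Omega^* \subset \Omega$, such that
for every $\omega \in \Omega^*$, for every commuting family of measure-preserving transforma-
tions $\tau_1, \ldots, \tau_s$ on $(Y, \pi)$ and any $f \in L_q(Y, \pi)$, $1 < q \le 2$, the series

$$\sum_{n=1}^{\infty} \frac{X_n(\omega) T^{j_n} f}{n^{(2-q)/(2q)}} \quad \text{converges } \pi\text{-a.e.}$$

Furthermore, for every $\omega \in \Omega^*$ there exists a constant $K_\omega < \infty$, independent of $f$,
such that

$$\left\| \sup_{n \ge 1} \left| \sum_{k=1}^{n} \frac{X_k(\omega) T^{j_k} f}{k^{(2-q)/(2q)}} \right| \right\|_q \le K_\omega \|f\|_q.$$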


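PROOF. As noted in the proof of Theorem 5.1, we may and do assume that
$|j_n|^* \le e^n$ for every $n \ge 1$.
By the Beppo Levi theorem and by our assumptions, for $\mu$-a.e. $\omega \in \Omega$ we have

$$(*) \qquad \sum_{n=1}^{\infty} (|X_n(\omega)|^2 + \|X_n\|_2^2) \log(n \vee |j_n|^*) (\log n)^2 < \infty.$$

By the second assertion in Theorem 3.8, for $\mu$-a.e. $\omega \in \Omega$, there exists $C_\omega$ such
that, for every $m > n \ge 1$ and every $K \ge 1$, we have

$$\max_{(\eta, t) \in [-K,K] \times [-K,K]^s} \left| \sum_{k=n+1}^{m} X_k(\omega) e^{i \eta \log k} e^{i\langle j_k, t \rangle} \right|^2 \le C_\omega \log(K + 1) \log(m \vee |j_m|^* \vee \log m + 1) \sum_{k=n+1}^{m} (|X_k(\omega)|^2 + \|X_k\|_2^2).$$

In particular, for every $\eta \in \mathbb{R}$ we have

$$(**) \qquad \max_{t \in [-\pi,\pi)^s} \left| \sum_{k=n+1}^{m} X_k(\omega) e^{-i(1/2)\eta \log k} e^{i\langle j_k, t \rangle} \right|^2 \le 2 C_\omega \log(|\eta| + 2) \log(m \vee |j_m|^* + 1) \sum_{k=n+1}^{m} (|X_k(\omega)|^2 + \|X_k\|_2^2).$$

Let $\Omega^*$ be the set for which (*) and (**) hold, and fix $\omega \in \Omega^*$. Hence, for any
$f \in L_2(Y, \pi)$ we deduce from (**) and (12), that

$$(14) \qquad \left\| \sum_{k=n+1}^{m} X_k(\omega) e^{-i(1/2)\eta \log k} T^{j_k} f \right\|_2^2 \le 2 C_\omega \|f\|_2^2 \log(|\eta| + 2) \log(m \vee |j_m|^* + 1) \sum_{k=n+1}^{m} (|X_k(\omega)|^2 + \|X_k\|_2^2).$$

For any complex $\zeta = \xi + i\eta$ with $0 \le \xi \le 1$ put $W_{n,\zeta}(T) := \sum_{k=1}^{n} X_k(\omega)
k^{-(1/2)\zeta} T^{j_k}$. Using (*) and (14), we apply Theorem 4.1 with $\alpha_n = |X_n(\omega)|^2 +
\|X_n\|_2^2$ and $A_n = 2 C_\omega \|f\|_2^2 \log(|\eta| + 2) \log(n \vee |j_n|^* + 1)$ in order to obtain

$$\left\| \sup_{n \ge 1} |W_{n,i\eta}(T) f| \right\|_2 \le C_1 \sqrt{\log(|\eta| + 2)} \, \|f\|_2,$$

for some $C_1 > 0$, which does not depend on $\eta$ or $f$. On the other hand we have

$$(15) \qquad \left\| \sup_{n \ge 1} |W_{n,1+i\eta}(T) f| \right\|_1 \le \|f\|_1 \left( \sum_{n=1}^{\infty} |X_n(\omega)|^2 \log(n \vee |j_n|^*) (\log n)^2 \right)^{1/2} \times \left( \sum_{n=1}^{\infty} \frac{1}{n \log(n \vee |j_n|^*) (\log n)^2} \right)^{1/2}.$$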




For a subset A C Y let MA be the operator of multiplication by 14. For any 
bounded integer-valued function J > 1 defined on Y we have linear operators 


max I J 
VrHT)- > Mu 2 Xy(o)k Q/9* Tk. 
jal k=1 


Hence for f € L(Y, x) and y € Y we have 


Iy) 


Wre(T) f(y) = Y Xok PETE f(y), 
k=l 


so [Wr z (T) f(y)| € sup,si | pay Xe (o)k- 0/9 Th f (y). For fı and fz simple 
functions on Y it is easy to check that ®(¢) :— f Vi - (T) fi - fa dz is continuous 
in the strip 0 < € < 1 and analytic in its interior. 

Our previous estimates yield 


[nin Df ll, x Cilflaylegüni--) f eLm), 
Ir nis) fI s Collflh. f € Li). 


Stein's interpolation ([38], Theorem X1I.1.39) yields that for 1 < q < 2 we have, 
with t = (2 — q)/q, 


Iri Of < Ci3,4l f lla; f eL). 


Note that the constant C1,2, depends only on C1, C2 and q, and not on the choice 
of I or f. Keeping q fixed and taking f € L4), we define [y(y) as the first 
integer j for which 


j a 
3 Xok T# f(y) 














n 
2 —t/2 jk 
ymax |) | Xe(w)k T* f(y) 
k=l k=l 
and obtain 
n 
li X —(2—4)/Qa)rk £ 
Noe In N "» KK B j 




















ks] q 
iva, DII, < Cruzat fle 


This yields the $L_q(\pi)$-integrability of the maximal function. Furthermore, it yields
that $\sup_{n \ge 1} |\sum_{k=1}^{n} X_k(\omega) k^{-(2-q)/(2q)} T^k f| < \infty$ $\pi$-a.e. for any $f \in L_q(\pi)$. Note
that $\Omega^*$ is a subset of the set $\Omega^*$ which was defined in Theorem 5.1. So, for
$f \in L_2(\pi)$ the series $\sum_{n=1}^{\infty} X_n(\omega) T^n f$ converges $\pi$-a.e.; hence also, by Abel's
summation by parts, $\sum_{n=1}^{\infty} X_n(\omega) n^{-(2-q)/(2q)} T^n f$ converges $\pi$-a.e. Since $L_2$
is dense in $L_q$, the Banach principle yields that for any $f \in L_q(\pi)$ the series
$\sum_{n=1}^{\infty} X_n(\omega) n^{-(2-q)/(2q)} T^n f$ converges $\pi$-a.e. □
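
The exponent bookkeeping in the interpolation step above can be checked mechanically. The following is a minimal sketch, not part of the original argument: the function name `stein_parameters` is hypothetical, and it only verifies the convexity relation $1/q = (1-t)/2 + t/1$ between the $L_2$ and $L_1$ endpoints and the resulting normalization exponent $t/2 = (2-q)/(2q)$.

```python
from fractions import Fraction


def stein_parameters(q: Fraction):
    """Return (t, t/2) for interpolating between L_2 (t = 0) and L_1 (t = 1).

    Hypothetical helper for illustration: t = (2 - q)/q is the interpolation
    parameter and t/2 is the exponent in the weights k^{-t/2}.
    """
    if not (1 < q < 2):
        raise ValueError("q must satisfy 1 < q < 2")
    t = (2 - q) / q
    # Convexity relation of the interpolation theorem: 1/q = (1 - t)/2 + t/1.
    assert Fraction(1) / q == (1 - t) / 2 + t
    return t, t / 2


if __name__ == "__main__":
    for q in (Fraction(4, 3), Fraction(3, 2), Fraction(5, 3)):
        t, exponent = stein_parameters(q)
        # The exponent agrees with (2 - q)/(2q) from the statement of the result.
        assert exponent == (2 - q) / (2 * q)
        print(f"q = {q}: t = {t}, normalization exponent = {exponent}")
```

Exact rational arithmetic is used so that the identities hold without rounding error.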




REMARK. In (15) we could sharpen the estimation in order to obtain a slightly
better rate in the normalization $n^{-(2-q)/(2q)}$ that appears in Theorems 5.3 and 5.4
below.


Recall that a Dunford-Schwartz operator on $L_1(Y, \pi)$ is a contraction $T$ which
is also a contraction of $L_\infty(Y, \pi)$, and therefore is also a contraction of each
$L_q(Y, \pi)$, $1 < q < \infty$, by the Riesz-Thorin theorem (for a simple proof for
Markov operators, see [21], page 65). Clearly, the operator $T$ defined in the
previous theorem is a special kind of Dunford-Schwartz operator.


THEOREM 5.4. Let $\{X_n\} \subset L_2(\Omega, \mu)$ be a sequence of centered independent
random variables, and let $\{p_n\}$ and $\{q_n\}$ be sequences of natural numbers. If the
series


OQ 
3: X«li2log(n v pz v a) ülogn?? 


n=l 


converges, then there exists a set of full measure $\Omega^* \subset \Omega$, such that for every
$\omega \in \Omega^*$, for every pair of commuting Dunford-Schwartz operators $T_1$ and $T_2$ on a
space $L_1(Y, \pi)$ and any $f \in L_q(Y, \pi)$, $1 < q < 2$, the series


nQ-4)/Q2) 5 B 
Furthermore, for every $\omega \in \Omega^*$ there exists a constant $K_\omega < \infty$, independent of $f$,
such that


n n rmn 
sup? n G-2)/ 08) 


nz 


— 














| < Kol fla. 
q 


PROOF. In order to apply Stein's interpolation theorem in Theorem 5.3, it was
needed that the operators involved there are Dunford-Schwartz. Since the proof of
Theorem 5.3 uses Theorem 5.1, the failure of the dilation theorem for more than
two commuting contractions restricts the present theorem to the case of only two
commuting Dunford-Schwartz operators. □


6. On the Wiener-Wintner property. In a series of papers (see [2], the 
book [3] and the references therein), Assani introduced the concepts of Wiener- 
Wintner functions and of Wiener-Wintner (dynamical) systems. These families 
are connected to several deep theorems of Bourgain (e.g., the return times theorem
and the double recurrence theorem). We show that the estimates obtained for
random trigonometric polynomials allow one to deduce the Wiener-Wintner prop- 
erty for functions (on a dynamical system), which satisfy some appropriate mixing 
conditions. 

Let us recall the notion of Wiener-Wintner functions. 


DEFINITION 6.1. Let $(\Omega, \Sigma, \mu, \theta)$ be a dynamical system. For $0 < \alpha < 1$,
a function $f$ is a WW function of power type $\alpha$ in $L_p(\Omega, \mu)$, $1 < p < \infty$, if there
exists a constant $C_f > 0$ such that

$$\Bigl\| \max_{t \in [-\pi, \pi]} \Bigl| \frac{1}{n} \sum_{k=1}^{n} e^{ikt} f \circ \theta^k \Bigr| \Bigr\|_p \le \frac{C_f}{n^{\alpha}} \qquad \text{for every } n \ge 1.$$

Now, we have the following corollary of Theorem 3.1.


PROPOSITION 6.1. Let $(\Omega, \Sigma, \mu, \theta)$ be a dynamical system, and put $\mathcal{F}_n =
\sigma\{f, \ldots, f \circ \theta^n\}$. For every $p > 1$, there exists $C_p > 0$, such that for any
$f \in L_\infty(\Omega, \mu)$,




















ikt k 
0 
p zn 33 á f ° 7 
(16) mass a | ji 
«BEN = (nui tf oe E[f o6 Filo) 
i=2k=1 


In particular, if $\{f \circ \theta^n\}$ is a martingale difference sequence, then for every $0 <
\alpha < 1/2$ the function $f$ is a WW function of order $\alpha$ in all $L_p(\Omega, \mu)$ spaces.
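
As a purely numerical illustration of the quantity controlled here, the sketch below evaluates $\max_t |\sum_{k=1}^{n} e^{ikt} \varepsilon_k|$ on a finite grid for i.i.d. random signs $\varepsilon_k$ (a simple martingale difference sequence) and compares it with $\sqrt{n \log(n+1)}$. The helper names, the grid size and the seed are assumptions made for this example only; a grid maximum merely bounds the true maximum from below, and nothing here replaces the proposition.

```python
import cmath
import math
import random


def max_modulus_on_grid(coeffs, grid_size):
    """Maximum over a uniform grid of [-pi, pi) of |sum_k coeffs[k-1] * e^{ikt}|."""
    best = 0.0
    for j in range(grid_size):
        t = -math.pi + 2.0 * math.pi * j / grid_size
        value = sum(c * cmath.exp(1j * k * t) for k, c in enumerate(coeffs, start=1))
        best = max(best, abs(value))
    return best


def rademacher_signs(n, seed=0):
    """n independent +/-1 signs from a seeded generator (illustrative input)."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


if __name__ == "__main__":
    for n in (16, 64, 256):
        m = max_modulus_on_grid(rademacher_signs(n), grid_size=8 * n)
        print(n, round(m, 3), round(m / math.sqrt(n * math.log(n + 1)), 3))
```

By the discrete Parseval identity, the grid maximum is at least $\sqrt{n}$ whenever the grid has at least $n$ points, and it is trivially at most $n$; the printed ratio shows how the observed maximum compares with the $\sqrt{n \log(n+1)}$ scale.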


Proposition 6.1 requires uniform estimation of the correlation coefficients, 
which may look too restrictive. 
It is possible to use Theorem 3.4, in order to obtain: 


PROPOSITION 6.2. Let $(\Omega, \Sigma, \mu, \theta)$ be a dynamical system. For every $p > 2$,
there exists $C > 0$, such that for every $f \in L_p(\Omega, \mu)$























ikt k 
in 3» Doo 
P 
C n i—l 172 
mp «es Ills + SY If HEL oria) l 
s2 k=] 


In particular, if $\{f \circ \theta^n\}$ generates a martingale difference sequence, then $f$ is a
WW function of power type $\alpha = 1 - 1/p - 1/2$ in $L_p(\Omega, \mu)$.


Clearly, control of the correlation coefficients involved in (16) (or in Proposi- 
tion 6.2) yields the Wiener-Wintner property. Such a control is possible in many 
situations. See, for example, the discussion on page 9 of [13]. A typical example
is provided by a Markov chain whose transition probability induces a quasi-compact
operator (see [19] and the references therein). For results without the
quasi-compactness assumption one can refer to [9] or [10]. 

We now give an application of Proposition 6.1 to K-automorphisms.




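DEFINITION 6.2 ([36]). Let $(\Omega, \mathcal{F}, \mu)$ be a probability space, and let $\theta$ be an
invertible measure-preserving point transformation on $\Omega$. The dynamical system
$(\Omega, \mathcal{F}, \mu, \theta)$ is called a K-automorphism if there exists a sub-$\sigma$-algebra $\mathcal{C}$, such
that

$$\theta^{-1}\mathcal{C} \subset \mathcal{C}; \qquad \bigcap_{n \ge 1} \theta^{-n}\mathcal{C} = \{\emptyset, \Omega\}; \qquad \bigcup_{n \ge 1} \theta^{n}\mathcal{C} \text{ is dense in } \mathcal{F}.$$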


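PROPOSITION 6.3. Let $(\Omega, \mathcal{F}, \mu, \theta)$ be a K-automorphism, and let $1 <
p < \infty$. There exists a set of functions, which is dense in $L_p^0(\mu) = \{f \in
L_p(\mu) : E(f) = 0\}$, such that for every $f$ in this set, there exists a constant $C_{f,p}$
such that

$$\Bigl\| \max_{t \in [-\pi, \pi]} \Bigl| \sum_{k=1}^{n} e^{ikt} f \circ \theta^k \Bigr| \Bigr\|_p \le C_{f,p} \sqrt{n \log(n + 1)} \qquad \text{for every } n \ge 1. \tag{17}$$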


PROOF. Let $\mathcal{C}$ be the sub-$\sigma$-algebra related to the K-automorphism
$(\Omega, \mathcal{F}, \mu, \theta)$. By Definition 6.2 the algebra $\bigcup_{n \ge 1} \theta^n \mathcal{C}$ is dense in $\mathcal{F}$, hence (by
basic measure theory) the set of functions c.l.m.$\{1_C - \mu(C) : C \in \bigcup_{n \ge 1} \theta^n \mathcal{C}\}$
is dense in each $L_p^0$ for the norm $\|\cdot\|_p$. By the second requirement in Definition 6.2,
as $k$ goes to infinity, the martingale $E(1_C | \theta^{-k}\mathcal{C})$ converges a.e. to $\mu(C)$
(see [14], Chapter VII, Theorem 4.3). So by the bounded convergence theorem,
$1_C - E(1_C | \theta^{-k}\mathcal{C})$ converges in $L_p$ to $1_C - \mu(C)$. Hence, the set of functions
$\{1_C - E(1_C | \theta^{-k}\mathcal{C}) : k \ge 1,\ C \in \theta^{k}\mathcal{C}\}$ is dense in $L_p^0$ for the norm $\|\cdot\|_p$, and it is
sufficient to prove that (17) holds for functions of this type.

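Let $k \ge 1$, and let $C \in \theta^{k}\mathcal{C}$. Put $f = 1_C - E(1_C | \theta^{-k}\mathcal{C})$. Since
$0 \le E(1_C | \theta^{-k}\mathcal{C}) \le 1$ a.e. we have $|f| \le 1$ a.e. Since $\theta^{-j}\mathcal{C} \subset \theta^{-k}\mathcal{C}$ for every
$j \ge k$, we have

$$E(f | \theta^{-j}\mathcal{C}) = E(1_C | \theta^{-j}\mathcal{C}) - E\bigl(E(1_C | \theta^{-k}\mathcal{C}) \,\big|\, \theta^{-j}\mathcal{C}\bigr) = 0 \tag{†}$$
for every $j \ge k$.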

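CLAIM. Let $\Sigma_1$ be a sub-$\sigma$-algebra of $\mathcal{F}$, and let $\eta$ be an invertible measure-
preserving transformation. Then for any integrable random variable $Z$, we have
$E(Z \circ \eta | \Sigma_1) = [E(Z | \eta\Sigma_1)] \circ \eta$ a.e.

PROOF. For any $B \in \Sigma_1$ we have

$$\int_B E(Z \circ \eta | \Sigma_1)\, d\mu = \int 1_B \cdot Z \circ \eta \, d\mu = \int (1_{\eta B} \cdot Z) \circ \eta \, d\mu = \int 1_{\eta B} \cdot Z \, d\mu = \int 1_{\eta B}\, E(Z | \eta\Sigma_1)\, d\mu$$
$$= \int \bigl(1_{\eta B}\, E(Z | \eta\Sigma_1)\bigr) \circ \eta \, d\mu = \int_B E(Z | \eta\Sigma_1) \circ \eta \, d\mu.$$

Since $[E(Z | \eta\Sigma_1)] \circ \eta$ is $\Sigma_1$-measurable, the result follows from the uniqueness of
the conditional expectation. □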
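
The identity in the Claim can be sanity-checked on a finite probability space. The sketch below is an illustration under assumptions that are not in the original: the space is $\{0, \ldots, n-1\}$ with the uniform measure (preserved by every permutation), the sub-$\sigma$-algebra is given by a partition into blocks, and the helper names `cond_exp` and `check_claim` are hypothetical.

```python
from fractions import Fraction


def cond_exp(values, partition):
    """Conditional expectation under the uniform measure: block averages."""
    out = [None] * len(values)
    for block in partition:
        avg = sum((values[w] for w in block), Fraction(0)) / len(block)
        for w in block:
            out[w] = avg
    return out


def check_claim(z, eta, partition):
    """Check E(Z o eta | G) == [E(Z | eta G)] o eta pointwise.

    z: list of Fractions, eta: permutation given as a list, partition: the
    blocks generating the sub-sigma-algebra G.
    """
    n = len(z)
    z_eta = [z[eta[w]] for w in range(n)]                      # Z o eta
    lhs = cond_exp(z_eta, partition)                           # E(Z o eta | G)
    image = [[eta[w] for w in block] for block in partition]   # blocks of eta G
    inner = cond_exp(z, image)                                 # E(Z | eta G)
    rhs = [inner[eta[w]] for w in range(n)]                    # composed with eta
    return lhs == rhs


if __name__ == "__main__":
    z = [Fraction(v) for v in (3, 1, 4, 1, 5, 9)]
    eta = [1, 2, 3, 4, 5, 0]            # cyclic shift, invertible and measure preserving
    partition = [[0, 1], [2, 3, 4], [5]]
    print(check_claim(z, eta, partition))
```

Exact fractions avoid any floating-point tolerance in the comparison.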


For every $1 \le i \le n$, put $X_i = f \circ \theta^{n+1-i}$, and for $i > n$ put $X_i = 0$. Since
$|f| \le 1$, also $|X_i| \le 1$. As usual, for any $l \ge 1$ put $\mathcal{F}_l = \sigma(X_1, \ldots, X_l)$. Since $f$ is
Q-measurable, X; is g—“+1—-1) Q- measurable, so $£$;cg-"ti-De Letl<j<l 
with | — j > k. Using Claim and equality (1) we have 


E(X1|Fj) =E(E(f o0"*1-1]g-7Pe)|s;) 
= E([E(/10 07? 6)] o 9^*17157) =0 
With our previous notation, we obtain 


Roa -X1x Iz EY | X ;E(X; EAM 


i-lj-i 


=D ie ED IXEC [flos x n +nk. 


j=li=j 


By Proposition 6.1, there exists some universal constant Cp, such that 




















x e Xil xCpknlog(n + 1). 
a T mx) 
Then it follows easily that 
max Lefo < SkCpVnlog(n + 1), 
te[—z, x1 » 




















where C f p = KE s. 


REMARKS. 1. The estimate (17) considerably improves the estimate obtained 
in the proof of Theorem 4 in [2]. Similarly, our results can be applied to improve 
Theorems 6 and 7 of [2]. 

2. The dense set of functions appearing in Proposition 6.3 is the same as the 
one considered in [2]. 

3. The constant $C_f$ may be chosen uniformly for $\{f \circ \theta^l : l \in \mathbb{Z}\}$. Hence the
estimate (17) can be obtained along blocks, and we may apply Proposition 2.6 (see 
also the remarks after it) in order to deduce uniform convergence of the one-sided 
rotated Hilbert transform, even with rate. 

4. As noticed by Assani, Proposition 6.3 gives an example of a Wiener-Wintner 
dynamical system of all powers $0 < \alpha < 1/2$.




Acknowledgments. The manuscript was completed during the first author's 
postdoctoral fellowship at the E. Schrödinger Institute, Vienna. The first author
is very grateful to Paul Fuhrmann for his advice and encouragement. Both authors
are very grateful to Michael Lin for mathematical discussions and his constant 
enthusiasm. 


REFERENCES 


[1] ASSANI, I. (1998). A weighted pointwise ergodic theorem. Ann. Inst. H. Poincaré Probab. 
Statist. 34 139—150. MR1617709 

[2] ASSANI, I. (2003). Wiener-Wintner dynamical systems. Ergodic Theory Dynam. Systems 23 
1637-1654. MR2032481 

[3] ASSANI, I. (2003). Wiener Wintner Ergodic Theorems. World Scientific, River Edge, NJ. 
MR1995517 

[4] BOUKHARI, F. and WEBER, M. (2002). Almost sure convergence of weighted series of con- 
tractions. Illinois J. Math. 46 1-21. MR1936072 

[5] COHEN, G. and CUNY, C. (2005). On Billard's theorem for random Fourier series. Bull. Polish
Acad. Sci. Math. 53 39-53. 

[6] COHEN, G. and CUNY, C. (2006). On random almost periodic series and random ergodic 
theory. Ann. Probab. To appear. 

[7] COHEN, G. and LIN, M. (2003). Laws of large numbers with rates and the one-sided ergodic 
Hilbert transform. Illinois J. Math. 47 997-1031. MR2036987

[8] COHEN, G. and LIN, M. (2006). Extensions of the Menchoff-Rademacher theorem with ap- 
plications to ergodic theory. Israel J. Math. To appear. 

[9] CONZE, J.-P. and RAUGI, A. (2003). Convergence of iterates of a transfer operator, appli- 
cation to dynamical systems and to Markov chains. ESAIM Probab. Statist. 7 115-146. 
MR1956075 

[10] CUNY, C. (2003). Un TCL avec vitesse pour la marche aléatoire gauche sur le groupe affine
de R^d. Ann. Inst. H. Poincaré Probab. Statist. 39 487-503. MR1978988

[11] CUNY, C. (2005). On randomly weighted one-sided ergodic Hilbert transforms. Ergodic Theory 
Dynam. Systems 25 89-99; 101—106. MR2122913 

[12] CUZICK, J. and LAI, T. L. (1980). On random Fourier series. Trans. Amer. Math. Soc. 261 
53-80. MR576863 

[13] DEDECKER, J. (2001). Exponential inequalities and functional central limit theorem for ran- 
dom fields. ESAIM Probab. Statist. 5 77-104. MR1875665 

[14] DOOB, J. L. (1953). Stochastic Processes. Wiley, New York. MR58896

[15] DOWNAROWICZ, T. and IWANIK, A. (1988). Multiple recurrence of discrete time Markov 
processes II. Colloq. Math. 55 311-316. MR978928

[16] FAN, A. and SCHNEIDER, D. (2003). Sur une inégalité de Littlewood-Salem. Ann. Inst. H. 
Poincaré Probab. Statist. 39 193-216. MR1962133 

[17] FELLER, W. (1966). An Introduction to Probability Theory and Its Applications I, 2nd ed. 
Wiley, New York. 

[18] FERNIQUE, X. (1974). Régularité des trajectoires de fonctions aléatoires Gaussiennes. 
Springer, New York. MR413238 

[19] HENNION, H. and HERVÉ, L. (2001). Limit Theorems for Markov Chains and Stochastic Properties
of Dynamical Systems by Quasi-Compactness. Springer, New York. MR1862393

[20] KAHANE, J.-P. (1985). Some Random Series of Functions, 2nd ed. Cambridge Univ. Press. 
MR833073 

[21] KRENGEL, U. (1985). Ergodic Theorems. de Gruyter, Berlin. MR797411 




[22] LEDOUX, M. and TALAGRAND, M. (1991). Probability in Banach Spaces. Springer, Berlin. 
MR1102015 

[23] MARCUS, M. B. and PISIER, G. (1981). Random Fourier Series with Applications to Har- 
monic Analysis. Princeton Univ. Press. MR630532 

[24] MARCUS, M. B. and PISIER, G. (1984). Characterizations of almost surely continuous 
p-stable random Fourier series and strongly stationary processes. Acta Math. 152 
245-301. MR741056 

[25] MENCHOFF, D. (1923). Sur les séries des fonctions orthogonales, I. Fund. Math. 4 82-105. 

[26] MÓRICZ, F. (1976). Moment inequalities and the strong laws of large numbers. Z. Wahrsch. 
Verw. Gebiete 35 299-314. MR407950 

[27] SZ.-NAGY, B. (1974). Unitary Dilations of Hilbert Space Operators and Related Topics. Amer. 
Math. Soc., Providence, RI. MR482291

[28] Sz.-NAGY, B. and FOIAS, C. (1970). Harmonic Analysis of Operators on Hilbert Space. 
North-Holland, Amsterdam. MR275190 

[29] PALEY, R. E. A. C. and ZYGMUND, A. (1930, 1932). On some series of functions, I; II; III.
Proc. Cambridge Philos. Soc. 26 337-357; 26 458-474; 28 190-205. 

[30] PARROTT, S. (1970). Unitary dilations for commuting contractions. Pacific J. Math. 34 
481-490. MR268710 

[31] RIESZ, F. and Sz.-NAGY, B. (1990). Functional Analysis. Dover, New York. MR1068530 

[32] RIO, E. (2000). Théorie asymptotique des processus aléatoires faiblement dépendants.
Springer, Berlin. MR2117923 

[33] ROSENBLATT, J. (1989). Almost everywhere convergence of series. Math. Ann. 280 565-577. 
MR939919 

[34] SALEM, R. and ZYGMUND, A. (1954). Some properties of trigonometric series whose terms 
have random signs. Acta Math. 91 245-301. MR65679 

[35] SCHNEIDER, D. (1997). Théorèmes ergodiques perturbés. Israel J. Math. 101 157-178.
MR1484874 

[36] WALTERS, P. (1982). An Introduction to Ergodic Theory. Springer, New York. MR648108 

[37] WEBER, M. (2000). Estimating random polynomials by means of metric entropy methods. 
Math. Inequal. Appl. 3 443-457. MR1768824 

[38] ZYGMUND, A. (1968). Trigonometric Series, corrected 2nd ed. Cambridge Univ. Press.
MR236587

DEPARTMENT OF MATHEMATICS
BEN GURION UNIVERSITY
P.O.B. 653
84105 BEER SHEVA
ISRAEL
E-MAIL: guycohen@ee.bgu.ac.il

DEPARTMENT OF MATHEMATICS
UNIVERSITY OF NEW CALEDONIA
EQUIPE ERIM
B.P. 4477
F-98847 NOUMEA CEDEX
FRANCE
E-MAIL: cuny@univ-nc.nc


The Annals of Probability 

2006, Vol. 34, No. 1, 80-121

DOI: 10.1214/009117905000000378

© Institute of Mathematical Statistics, 2006 


MAXIMA OF ASYMPTOTICALLY GAUSSIAN RANDOM FIELDS 
AND MODERATE DEVIATION APPROXIMATIONS TO BOUNDARY 
CROSSING PROBABILITIES OF 
SUMS OF RANDOM VARIABLES WITH 
MULTIDIMENSIONAL INDICES 


By HOCK PENG CHAN¹ AND TZE LEUNG LAI²
National University of Singapore and Stanford University 


Several classical results on boundary crossing probabilities of Brownian 
motion and random walks are extended to asymptotically Gaussian random 
fields, which include sums of i.i.d. random variables with multidimensional 
indices, multivariate empirical processes, and scan statistics in change-point 
and signal detection as special cases. Some key ingredients in these extensions
are moderate deviation approximations to marginal tail probabilities
and weak convergence of the conditional distributions of certain “clumps”
around high-level crossings. We also discuss how these results are related to 
the Poisson clumping heuristic and tube formulas of Gaussian random fields, 
and describe their applications to laws of the iterated logarithm in the form 
of the Kolmogorov–Erdős–Feller integral tests.


1. Introduction. The goal of this paper is to extend a number of classi- 
cal results on boundary crossing probabilities of Brownian motion and random 
walks to much more general stochastic processes involving multidimensional in- 
dices (i.e., random fields). These extensions were motivated by applications to
signal detection and change-point problems; see Example 2.2 and the last two 
paragraphs of Section 4. Other applications include the laws of the iterated loga- 
rithm for sums of i.i.d. random variables with multidimensional indices (see Sec- 
tion 3), Kolmogorov-Smirnov statistics of multivariate distributions and sums of 
linear processes with long-range dependence (see Section 4). To begin with, let 
$\{W(t) : t \ge 0\}$ be Brownian motion and let $T_c = \inf\{t > 0 : W(t) \ge b_c(t)\}$ be the
first time when Brownian motion crosses a positive continuously differentiable
boundary $b_c$. Strassen [34], Jennen and Lerche [22], Wichura [37] and others have
shown that $T_c$ has a density function $p_c$ and that under certain additional conditions
on $b_c$, $p_c$ has the “tangent approximation”

$$p_c(t) \sim t^{-3/2}\, a_c(t)\, \varphi\bigl(b_c(t)/\sqrt{t}\bigr), \tag{1.1}$$


Received April 2004; revised November 2004. 
¹Supported by the National University of Singapore.
²Supported by the National Science Foundation and the Institute of Mathematical Research at
University of Hong Kong. 
AMS 2000 subject classifications. Primary 60F10, 60G60; secondary 60F20, 60G15. 
Key words and phrases. Multivariate empirical processes, moderate deviations, random fields, in- 
tegral tests, boundary crossing probability. 




where $\varphi(x) = (2\pi)^{-1/2} e^{-x^2/2}$ is the standard normal density function and $a_c(t) =
b_c(t) - t b_c'(t)$. Note that in the case of a linear boundary $b_c(t) = a + \beta t$ (with
$a > 0$ and $\beta > 0$), the well-known Bachelier–Lévy formula yields $p_c(t) =
t^{-3/2} a \varphi(b_c(t)/\sqrt{t})$, so (1.1) simply replaces $a$ by the intercept $a_c(t)$ of the tangent
line passing through $(t, b_c(t))$, and is therefore called a “tangent approximation.”
For concave boundaries $b_c(t) = b(t)$ that become infinite as $t \to \infty$, one
typically has $b'(t) = o(b(t)/t)$, so one can replace $a_c(t)$ in (1.1) by $b(t)$. There
is a close connection between this approximation to $p_c(t)$ and the Kolmogorov–
Erdős–Feller test, which yields for nondecreasing $b(t)/\sqrt{t}$ the 0–1 dichotomy


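$$P\{W(t) < b(t) \text{ for all large } t\} = 1 \ (\text{or } 0) \quad \text{if } J(b) < \infty \ (\text{or} = \infty), \tag{1.2}$$

where $J(b) = \int_1^{\infty} t^{-3/2} b(t) \varphi(b(t)/\sqrt{t})\, dt \le \infty$. Similarly, if $S_n = X_1 + \cdots + X_n$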
with EX; =0, EX? — 1 and E|Xi[^ < oo, then for all n > 1, 


$$P\{S_n < b(n) \text{ for all large } n\} = 1 \ (\text{or } 0) \quad \text{if } J(b) < \infty \ (\text{or} = \infty). \tag{1.3}$$


If we think of the random walk $\{S_n, n \ge n_c\}$ in (1.3) as an “asymptotic” Brownian
motion as $n_c \to \infty$, then (1.3) can be regarded as the generalization of (1.2)
to processes that behave like Brownian motion. This suggests that if (1.1) and 
(1.2) can be extended to more general Gaussian processes, then they may even 
be expected to hold much more generally for processes that are "asymptotically 
Gaussian." In view of the functional central limit theorem for sums of weakly 
dependent or long-memory random variables, the scope of applications of such re- 
sults would be very broad. Unfortunately, functional central limit theorems, which 
are about the "central" part of the limiting Gaussian distributions, are not the right 
tools to handle the "rare" events in the high-level crossings as in (1.1) and (1.3). 
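
The linear-boundary case of the tangent approximation (1.1) can be sketched numerically. The code below is an illustration under stated assumptions, not the authors' method: the function names are hypothetical, the boundary is $b(t) = a + \beta t$, and the crossing probability is obtained by a midpoint rule in $s = \log t$ over an ad hoc range. It checks that the tangent formula coincides with the Bachelier–Lévy density for a linear boundary, and that this density integrates to the classical value $e^{-2a\beta}$ for the probability that Brownian motion ever crosses the line.

```python
import math


def phi(x):
    """Standard normal density."""
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)


def tangent_approximation(b, db, t):
    """t^{-3/2} * a(t) * phi(b(t)/sqrt(t)) with intercept a(t) = b(t) - t*b'(t)."""
    intercept = b(t) - t * db(t)
    return t ** -1.5 * intercept * phi(b(t) / math.sqrt(t))


def bachelier_levy_density(a, beta, t):
    """First-passage density of Brownian motion to the line a + beta*t."""
    return a * t ** -1.5 * phi((a + beta * t) / math.sqrt(t))


def crossing_probability(a, beta, lo=-12.0, hi=8.0, steps=20000):
    """Midpoint rule in s = log t for the integral of the density over (0, inf)."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        t = math.exp(lo + (i + 0.5) * h)
        total += bachelier_levy_density(a, beta, t) * t * h  # dt = t ds
    return total


if __name__ == "__main__":
    a, beta = 1.0, 0.5
    b = lambda t: a + beta * t
    db = lambda t: beta
    print(tangent_approximation(b, db, 2.0), bachelier_levy_density(a, beta, 2.0))
    print(crossing_probability(a, beta), math.exp(-2.0 * a * beta))
```

For a curved boundary the two density functions differ, and (1.1) is only an asymptotic statement under the additional conditions mentioned in the text.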

To extend (1.1) and (1.2) to much more general processes, our approach uses 
(i) moderate deviation approximations to marginal tail probabilities and (ii) weak 
convergence (to a limiting Gaussian process) of a certain conditional process given 
that the process attains a high level near the boundary at time t. Another key idea 
of our extension is to relax the requirement that the left-hand side of (1.1) be a 
first exit density. Instead we regard it as a “local” exit density at time $t$ so that
the probability that the process ever crosses the boundary within time interval D 
is asymptotically equal to the integral of the right-hand side of (1.1) over D. Not 
only does this avoid the technical assumptions that need to be imposed to ensure 
that the first exit time $T_c$ indeed has a density with respect to Lebesgue measure,
but it also dispenses with the notion of having a well-ordered set D so that the 
“first” time of exit can be defined. This enables us to extend our approach to ran- 
dom fields (with multidimensional time that is not well ordered). Section 2 gives 
basic assumptions for these “asymptotically Gaussian" random fields and states 
the main theorems that provide generalizations of (1.1) and (1.3). Applying these 
theorems to Gaussian random fields yields new results in Theorem 2.1 for the max- 
ima of Gaussian random fields. Section 5 gives the proofs. Connections to Aldous' 
[4] Poisson clumping heuristic and the Hotelling–Weyl tube formulas are also
discussed in Section 2.




2. Basic results and discussion. We begin with some notation that will be
used throughout the paper. Let $\psi(c) = (2\pi c^2)^{-1/2} \exp(-c^2/2)$. For vectors $t, u \in
\mathbb{R}^d$, the relation $t \le u$ means $t_i \le u_i$ for all $i$ and $t < u$ means $t_i < u_i$ for all $i$.
Also $\lfloor \cdot \rfloor$ will be used to denote the greatest integer function, $\|\cdot\|$ the (Euclidean)
norm of a vector, $|\cdot|$ the determinant of a square matrix and $v(\cdot)$ the $d$-dimensional
volume (or content) of a Jordan measurable set. For $\ell > 0$, let

$$I_{t,\ell} = \prod_{i=1}^{d} [t_i, t_i + \ell).$$

For $D \subset \mathbb{R}^d$ and $\delta > 0$, define $[D]_\delta = \{t + u : t \in D, \|u\| \le \delta\}$. We shall also
use $\nabla$ and $\nabla^2$ to denote the gradient vector and Hessian matrix, respectively,
of a function. Let $S^{d-1}$ denote the $(d - 1)$-dimensional unit sphere, and let $\mathbb{Z}_+$
($\mathbb{R}_+$) denote the set of positive integers (real numbers). Let $0 < \alpha \le 2$ and let
$\{W_t(u) : u \in [0, \infty)^d\}$ be a continuous Gaussian random field (whose existence
follows from Theorem 2.1 of [25]) such that


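$$W_t(0) = 0, \qquad E[W_t(u)] = -\|u\|^{\alpha} r_t(u/\|u\|)/2,$$
$$\mathrm{Cov}(W_t(u), W_t(v)) = \bigl[\|u\|^{\alpha} r_t(u/\|u\|) + \|v\|^{\alpha} r_t(v/\|v\|) - \|u - v\|^{\alpha} r_t((u - v)/\|u - v\|)\bigr]/2, \tag{2.1}$$

where $r_t : S^{d-1} \to \mathbb{R}_+$ is a continuous function satisfying

$$\sup_{v \in S^{d-1}} |r_t(v) - r_u(v)| \to 0 \quad \text{as } u \to t. \tag{2.2}$$

Of particular importance in the subsequent development are

$$H_K(t) = \int_0^{\infty} e^{y} P\Bigl\{ \sup_{0 \le u_i \le K\ \forall i} W_t(u) > y \Bigr\}\, dy, \qquad H(t) = \lim_{K \to \infty} K^{-d} H_K(t), \tag{2.3}$$
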

which are shown to be well defined in Theorems 2.4 and 2.5. 
Let $X$ be a stationary, isotropic Gaussian random field such that $EX(0) = 0$,
$EX^2(0) = 1$ and

$$E[X(0)X(u)] = 1 - (1 + o(1)) \|u\|^{\alpha} L(\|u\|) \quad \text{as } u \to 0, \tag{2.4}$$

for some $0 < \alpha \le 2$ and slowly varying function $L$. Let

$$\Delta_c = \min\{x > 0 : x^{\alpha} L(x) = (2c^2)^{-1}\}. \tag{2.5}$$

For example, if $L(x) = 1$, then $\Delta_c = (2c^2)^{-1/\alpha}$. Let $D$ be a bounded, Jordan measurable
set such that $[D]_\delta$ lies in the domain of $X$ for some $\delta > 0$. Then by Theorem 2.1
of [31],

$$P\Bigl\{ \sup_{t \in D} X(t) > c \Bigr\} \sim \psi(c)\, \Delta_c^{-d}\, v(D)\, H, \tag{2.6}$$




where $H = \lim_{K \to \infty} K^{-d} \int_0^{\infty} e^{y} P\{\sup_{0 \le u_i \le K\ \forall i} W_0(u) > y\}\, dy$ is a positive, finite
constant and $W_0$ is the Gaussian random field defined in (2.1) with $r_0(u) \equiv 1$.
Our goal is to extend (2.6) first to more general Gaussian random fields satisfying

$$E[X(t)X(t + u)] = 1 - (1 + o(1)) \|u\|^{\alpha} L(\|u\|)\, r_t(u/\|u\|) \quad \text{as } u \to 0, \tag{2.7}$$

uniformly over $t \in [D]_\delta$. We then extend (2.6) to non-Gaussian random fields that
are asymptotically Gaussian in a moderate deviation sense.
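
The normalizing scale $\Delta_c$ of (2.5) is explicit only for special $L$. As a minimal sketch, assuming that $x \mapsto x^{\alpha} L(x)$ is increasing on the search interval (so that the smallest root is the only one), it can be computed by bisection. The function name `delta_c`, the bracket `x_hi` and the iteration count are choices made for this illustration, not part of the paper.

```python
import math


def delta_c(c, alpha, L=lambda x: 1.0, x_hi=1.0, iters=200):
    """Solve x^alpha * L(x) = 1/(2 c^2) on (0, x_hi] by bisection.

    Assumes x^alpha * L(x) is increasing on (0, x_hi], which is an assumption
    of this sketch and has to be checked for the slowly varying L at hand.
    """
    target = 1.0 / (2.0 * c * c)
    if x_hi ** alpha * L(x_hi) < target:
        raise ValueError("x_hi is too small to bracket the root")
    lo, hi = 0.0, x_hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid ** alpha * L(mid) < target:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    # With L = 1 the closed form (2 c^2)^{-1/alpha} is available for comparison.
    for c, alpha in ((3.0, 1.0), (2.0, 2.0), (5.0, 1.5)):
        print(delta_c(c, alpha), (2.0 * c * c) ** (-1.0 / alpha))
```

With $L \equiv 1$ the bisection reproduces $\Delta_c = (2c^2)^{-1/\alpha}$, which is the example given after (2.5).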


2.1. Gaussian random fields. Let $X$ be a Gaussian random field such that
$EX(t) = 0$, $EX^2(t) = 1$ for all $t$. Let $D$ be such that $[D]_\delta$ is a subset of the domain
of $X$ for some $\delta > 0$. The following theorem, whose proof is given in Section 5,
generalizes (2.6) far beyond the stationary isotropic framework considered
by Qualls and Watanabe [31] under (2.4).


THEOREM 2.1. Suppose the Gaussian random field $X$ satisfies condition (2.7),
in which $0 < \alpha \le 2$ and $r_t : S^{d-1} \to \mathbb{R}_+$ is a continuous function such
that the convergence in (2.2) is uniform in $t \in [D]_\delta$ and $\sup_{t \in [D]_\delta, v \in S^{d-1}} r_t(v) < \infty$.
Then, with $H(t)$ defined by (2.3),

$$P\Bigl\{ \sup_{u \in I_{t, \ell_c \Delta_c}} X(u) > c \Bigr\} \sim \ell_c^{d}\, \psi(c)\, H(t) \tag{2.8}$$

uniformly over $t \in D$, as $c \to \infty$ and $\ell_c \to \infty$ with $\ell_c = o(\Delta_c^{-1})$. Moreover, if $D$ is
bounded and Jordan measurable, then as $c \to \infty$,

$$P\Bigl\{ \sup_{t \in D} X(t) > c \Bigr\} \sim \psi(c)\, \Delta_c^{-d} \int_D H(t)\, dt. \tag{2.9}$$

The following special case of Theorem 2.1, with $d = 2$, demonstrates the usefulness
of including the function $r_t$ on $S^{d-1}$ in (2.2) when (2.1) is extended to
nonstationary Gaussian random fields. It will be discussed further in Example 2.10
and at the end of Section 4.


EXAMPLE 2.2. Let $X(t_1, t_2) = (t_2 - t_1)^{-1/2} [W(t_2) - W(t_1)]$, where $W(\cdot)$ is
Brownian motion, and $D = \{(t_1, t_2) : 0 \le t_1 < t_2 \le a,\ a_1 \le t_2 - t_1 \le a_2\}$ with $0 <
a_1 < a_2 < a$. Then

$$E[X(t) X(t + u)] = 1 - (1 + o(1)) \frac{|u_1| + |u_2|}{2(t_2 - t_1)}$$

as $u \to 0$. Hence (2.7) is satisfied with $\alpha = 1$, $L(\|u\|) = 1$ and $r_t(u) = (|u_1| +
|u_2|)/[2(t_2 - t_1)]$. Therefore $\Delta_c = (2c^2)^{-1}$ in view of (2.5), and $H(t) = 2^{-4} (t_2 -
t_1)^{-2}$ by Lemma 2.3 below. Application of Theorem 2.1 then yields that as $c \to \infty$,

$$P\bigl\{ (t_2 - t_1)^{-1/2} [W(t_2) - W(t_1)] > c \text{ for some } 0 \le t_1 < t_2 \le a,\ a_1 \le t_2 - t_1 \le a_2 \bigr\}$$
$$\sim \psi(c) (2c^2)^2 \int_D 2^{-4} (t_2 - t_1)^{-2}\, dt_1\, dt_2$$
$$= \psi(c) (c^4/4) \int_{a_1}^{a_2} \int_0^{a - s} s^{-2}\, dt\, ds$$
$$= \psi(c) (c^4/4) \bigl[ a (a_1^{-1} - a_2^{-1}) - \log(a_2/a_1) \bigr]. \tag{2.10}$$
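
The last equality in (2.10) rests on the elementary integral of $(t_2 - t_1)^{-2}$ over $D$. The sketch below, with hypothetical function names and arbitrary test values of $a$, $a_1$, $a_2$ (assuming $a_2 \le a$), compares the closed form with a midpoint-rule quadrature and then evaluates the right-hand side of (2.10).

```python
import math


def closed_form(a, a1, a2):
    """a*(1/a1 - 1/a2) - log(a2/a1): the integral of (t2 - t1)^(-2) over D."""
    return a * (1.0 / a1 - 1.0 / a2) - math.log(a2 / a1)


def numeric_integral(a, a1, a2, steps=20000):
    """Midpoint rule in s = t2 - t1; for fixed s, t1 ranges over [0, a - s]."""
    h = (a2 - a1) / steps
    total = 0.0
    for i in range(steps):
        s = a1 + (i + 0.5) * h
        total += (a - s) / (s * s) * h
    return total


def rhs_2_10(c, a, a1, a2):
    """psi(c) * (c^4/4) * [a(1/a1 - 1/a2) - log(a2/a1)] as in (2.10)."""
    psi = math.exp(-c * c / 2.0) / (math.sqrt(2.0 * math.pi) * c)
    return psi * c ** 4 / 4.0 * closed_form(a, a1, a2)


if __name__ == "__main__":
    a, a1, a2 = 2.0, 0.5, 1.5
    print(closed_form(a, a1, a2), numeric_integral(a, a1, a2))
    print(rhs_2_10(4.0, a, a1, a2))
```

The value returned by `rhs_2_10` is only the asymptotic expression; for moderate $c$ it need not be close to the exact probability.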


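LEMMA 2.3. Let $\{W_t(u) : u \in [0, \infty)^d\}$ be a continuous Gaussian random
field such that for some positive functions $\beta_1, \ldots, \beta_d$,

$$E[W_t(u)] = -\sum_{i=1}^{d} \beta_i(t) u_i / 2$$

and

$$\mathrm{Cov}[W_t(u), W_t(v)] = \sum_{i=1}^{d} \beta_i(t) (u_i + v_i - |u_i - v_i|)/2 = \sum_{i=1}^{d} \beta_i(t) \min(u_i, v_i).$$

Then $H(t) = 2^{-d} \prod_{i=1}^{d} \beta_i(t)$.

PROOF. For $u \ge 0$, $W_t(u) = \sum_{i=1}^{d} B_{i,t}(u_i)$, where $\{B_{i,t}\}_{1 \le i \le d}$ are independent
Gaussian processes with independent increments, $E[B_{i,t}(u_i)] = -\beta_i(t) u_i / 2$
and $\mathrm{Var}(B_{i,t}(u_i)) = \beta_i(t) u_i$, so $B_{i,t}(u_i) \stackrel{d}{=} W(\beta_i(t) u_i) - \beta_i(t) u_i / 2$. As $K \to \infty$,

$$H_K(t) = \int_0^{\infty} e^{y} P\Bigl\{ \sup_{0 \le u_i \le K\ \forall i} \sum_{i=1}^{d} B_{i,t}(u_i) > y \Bigr\}\, dy$$
$$= \int_0^{\infty} e^{y} P\Bigl\{ \sum_{i=1}^{d} \sup_{0 \le u_i \le K} B_{i,t}(u_i) > y \Bigr\}\, dy$$
$$\sim E\Bigl[ \exp\Bigl( \sum_{i=1}^{d} \sup_{0 \le u_i \le K} B_{i,t}(u_i) \Bigr) \Bigr]$$
$$= \prod_{i=1}^{d} E\Bigl\{ \exp\Bigl( \sup_{0 \le u_i \le K} \bigl[ W(\beta_i(t) u_i) - \beta_i(t) u_i / 2 \bigr] \Bigr) \Bigr\},$$

so $H_K(t) \sim \prod_{i=1}^{d} [\beta_i(t) K / 2]$; see [18], (1.8.11) for the last asymptotic relation. □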
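
The product formula of Lemma 2.3 is simple to evaluate. As a small sketch (the function names are hypothetical), the code below computes $H(t) = 2^{-d} \prod_i \beta_i(t)$ and specializes it to the setting of Example 2.2, where $\beta_1(t) = \beta_2(t) = 1/[2(t_2 - t_1)]$, recovering $H(t) = 2^{-4}(t_2 - t_1)^{-2}$.

```python
import math


def h_from_betas(betas):
    """H(t) = 2^{-d} * prod_i beta_i(t), with betas the values beta_i(t) at a fixed t."""
    return math.prod(betas) / 2 ** len(betas)


def h_example_2_2(t1, t2):
    """Example 2.2: beta_1 = beta_2 = 1 / (2 (t2 - t1)), requires t2 > t1."""
    if t2 <= t1:
        raise ValueError("need t1 < t2")
    beta = 1.0 / (2.0 * (t2 - t1))
    return h_from_betas([beta, beta])


if __name__ == "__main__":
    for t1, t2 in ((0.0, 2.0), (0.5, 1.0), (1.0, 4.0)):
        # Compare with the closed form 2^{-4} (t2 - t1)^{-2} quoted in Example 2.2.
        print(h_example_2_2(t1, t2), 2.0 ** -4 * (t2 - t1) ** -2)
```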
2.2. Asymptotically Gaussian random fields. Theorem 2.1 is derived in Section 5
as a special case of a more general result on asymptotically Gaussian random
fields satisfying conditions (C) and (A1)-(A5) below. Specifically, for $c > 0$, let $X_c$
be random fields such that $EX_c(t) = 0$, $EX_c^2(t) = 1$ for all $c$ and $t$. Let $D$ be such
that $[D]_\delta$ is a subset of the domain of $X_c$ for some $\delta > 0$ and all $c$ large enough.
Define $\rho_c(t, u) = E[X_c(t) X_c(u)]$. In analogy with (2.7), assume that there exist
$0 < \alpha \le 2$ and a slowly varying function $L$ such that as $u \to 0$,


$$\rho_c(t, t + u) = 1 - (1 + o(1)) \|u\|^{\alpha} L(\|u\|)\, r_t(u/\|u\|) \tag{C}$$

uniformly over $t \in [D]_\delta$ and compact sets of $u/\Delta_c \ge 0$. Moreover, assume that the
following conditions also hold uniformly over $t \in [D]_\delta$, as $c \to \infty$:

$$P\{X_c(t) > c - y/c\} \sim \psi(c - y/c) \tag{A1}$$

uniformly over positive, bounded values of $y$. The convergence in (2.2) is assumed
to be uniform in $t \in [D]_\delta$, with $\sup_{t \in [D]_\delta, v \in S^{d-1}} r_t(v) < \infty$. Moreover, for any
$a > 0$ and positive integers $m$, as $c \to \infty$,

$$\{c[X_c(t + ak\Delta_c) - X_c(t)] : 0 \le k_i < m\} \mid X_c(t) = c - y/c \ \Rightarrow\ \{W_t(ak) : 0 \le k_i < m\} \tag{A2}$$

uniformly over positive, bounded values of $y$, where we use “$\mid X_c(t) = c - y/c$”
to denote that the distribution is conditional on $X_c(t) = c - y/c$. In addition, there
exists a positive function $h$ such that $\lim_{y \to \infty} h(y) = 0$ and


(A3) P(Xc(t -uAc) >c—y/ce, Xt) <e — y/c] x h(y)v (c) 


for all u > 0 and y > 0, and there exist nonincreasing functions Na on R} and 
positive constants yg such that y; — O and N5(y4) + hai w Na(ya + w)d@ = 
o(a?) as a — 0, and 


(A4) P| sup Xo(t-+udc) >, Xe) <c = y/o} < NGC. 
O<u<a 

for all y; € y < c and s >Q. 

Whereas (A1) refers to the marginal distribution of $X_c(t)$, saying that $\{X_c(t) >
c - y/c\}$ has probability like that of a standard normal, the joint distribution of
$X_c(\cdot)$ is assumed in (A2) to be asymptotically normal in the sense of weak
convergence for local increments conditioned on $X_c(t) = c - y/c$. Note that the same
$\alpha$, $L(\cdot)$ and $r_t(\cdot)$ appear in (C) and the mean and covariance functions (2.1) of
the Gaussian field $W_t(\cdot)$ in (A2). In fact, if $X_c = X$ is a Gaussian field satisfying
condition (C), then (A2) holds; see the proof of Theorem 2.1 in Section 5.
Assumptions (A3) and (A4) are mild technical conditions under which the probability
of $\sup_{u \in I_{t, K\Delta_c}} X_c(u)$ exceeding $c$ can be computed via (A1) and (A2) after
the cube $I_{t, K\Delta_c} = \prod_{i=1}^{d} [t_i, t_i + K\Delta_c)$ is discretized by the grid points $t + ka\Delta_c$
($0 \le k_i < m$) with $a = K/m$, leading to the following.



THEOREM 2.4. Let $K > 0$. Assume (C) and (A1)-(A4). Then as $c \to \infty$,

$$P\Bigl\{ \sup_{u \in I_{t, K\Delta_c}} X_c(u) > c \Bigr\} \sim \psi(c) [1 + H_K(t)]$$

uniformly over $t \in [D]_\delta$, where $H_K(t)$ is defined in (2.3) and is finite and uniformly
continuous in $t \in [D]_\delta$.


To derive an analogue of (2.6) for $P\{\sup_{t \in D} X_c(t) > c\}$ in which $X_c$ satisfies
(C) and (A1)-(A4), we can sum the asymptotic formula in Theorem 2.4 over
$t \in (K\Delta_c \mathbb{Z})^d \cap D$ if the joint occurrence of two events (associated with two such
cubes) is negligible in comparison with the probability associated with a single
cube. The following simple condition ensures this: There exists a nonincreasing
function $f : [0, \infty) \to \mathbb{R}_+$ such that $f(\|r\|) = O(e^{-\|r\|^{p}})$ for some $p > 0$ and for
all $y > 0$ and $c$ sufficiently large,

$$P\{X_c(t) > c - y/c,\ X_c(t + u\Delta_c) > c - y/c\} \le \psi(c - y/c) f(\|u\|) \tag{A5}$$
uniformly in $t$ and $t + u\Delta_c$ belonging to $[D]_\delta$.


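THEOREM 2.5. Assume (C) and (A1)-(A5). Then as $c \to \infty$ and $\ell_c \to \infty$
such that $\ell_c = o(\Delta_c^{-1})$,

$$P\Bigl\{ \sup_{u \in I_{t, \ell_c \Delta_c}} X_c(u) > c \Bigr\} \sim \ell_c^{d}\, \psi(c)\, H(t), \tag{2.11}$$

$$P\Bigl\{ \sup_{u \in I_{t, \ell_c \Delta_c}} X_c(u) > c,\ \sup_{v \in B \setminus I_{t, \ell_c \Delta_c}} X_c(v) > c \Bigr\} = o(\ell_c^{d}\, \psi(c)), \tag{2.12}$$

uniformly over $t \in D$ and over subsets $B$ of $[D]_\delta$ with bounded volume, where
$H(t)$ is defined in (2.3) and is uniformly continuous and bounded below on $D$.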


Dividing (2.11) by $(\ell_c \Delta_c)^d$, which is the volume of $I_{t, \ell_c \Delta_c}$, yields an asymptotic
boundary crossing “density” $\Delta_c^{-d} \psi(c) H(t)$ of $X_c$ at $t$. By integrating this
“density” over $D$, or more precisely, by summing (2.11) over the “tiles” $I_{t, \ell_c \Delta_c}$ of
$D$ and applying (2.12) together with the fact that $D$ is bounded and Jordan measurable,
we obtain the following generalization of the Qualls–Watanabe result (2.6)
on stationary isotropic Gaussian random fields.


COROLLARY 2.6. Assume (C) and (A1)-(A5). Let $D$ be a bounded, Jordan
measurable set. Then

$$P\Bigl\{ \sup_{t \in D} X_c(t) > c \Bigr\} \sim \psi(c)\, \Delta_c^{-d} \int_D H(t)\, dt \quad \text{as } c \to \infty. \tag{2.13}$$




We can extend Corollary 2.6 to sets $D_c$ that grow with $c$. The assumption that
$D$ be Jordan measurable [i.e., for any $\varepsilon > 0$, the boundary $\partial D$ of $D$ can be covered
by rectangles $U_1, U_2, \ldots$ such that $\sum_j v(U_j) < \varepsilon$] and bounded in Corollary
2.6 is used to show that $\sum_{t \in (\ell \mathbb{Z})^d :\, I_{t,\ell} \cap \partial D \ne \emptyset} v(I_{t,\ell}) \to 0$ as $\ell \to 0$. When working
with sets $D_c$ that need not be bounded, we need to impose a more direct assumption
(2.14) on the contribution of $\partial D_c$ to the Riemann sum. Moreover, by condition
(C) or (A1)-(A5), we now mean that it holds uniformly over $t$ belonging to
$[D_c]_\delta$.


COROLLARY 2.7. Assume (C), (A1)-(A5) and that 


(2.14) sup lu—tl|— O(c") and gf 2 H(t) = o(v(Do)) 
tuc D, telte Z), lr t NIDAD 


for some k > 0 and positive C, with e > 0 and Ac = o(te). Then as c > oo, 


(2.15) P] sup X(t) > c| ~ voaz f H (t) dt. 


teD, 


2.3. Boundary crossing probabilities. To extend the conclusion of Corollary 2.7
to the boundary crossing probability $P\{X_c(t) > b_c(t) \text{ for some } t \in D_c\}$,
we proceed similarly by using the probabilities $p_c(t) = P\{X_c(s) > b_c(s) \text{ for some }
s \in I_{t, \ell_c}\}$ as building blocks, where $\ell_c \to 0$ is so chosen that


(2.16) sup Ap.) = o(tc) (hence ; 1 be(t) > oo) as c — oo. 
Elice J8 


te[Dels 


Whereas (A1)-(A5) are related to the time-invariant boundary c to be crossed 
by X,(-), we can formulate similar assumptions when c is replaced by a time- 
varying boundary b,(-). Let 


(2.17) b.= inf b,(u), be= sup b,(u). 
uel Dcls uctD.]s 


Analogous to (A1)-(A5), assume that the following conditions hold, as c — oo, 
uniformly in t € [De]s and b./2 <z < be: 
(B1) P{X-(t) >z} ~ Yk), 
{z[Xe(t +akAz) — Xe(t)]:0 < ki < m}| X(t) =z — y /z 
=> {W;(ak):0 x ki < m}, 


for any a > 0 and positive integers m, the convergence being uniform over positive, 
bounded values of y ; moreover, the convergence in (2.2) is assumed to be uniform 
int € [Dc], with sup;erp, y, vesa-1 r;(v) < oo. In addition, there exists a positive 
function A such that limy—.o. h (y) = 0 and 


(B3) P(Xc(t uA; »z—y/z, Xet) <z— y/z} < hyk) 


(B2) 




for all u > 0 and y > 0, and there exist nonincreasing functions Na on R+ and 
porive constants yg such that ya — 0 and Na (Ya) + fio Na(ya + ©) do = 
o(a?) as a — 0, and 


4) P| sup Xe +UA) >z Xel) <z—y/2} < NV) 
O<u<a 

for all yg € y < z and s > 0. Moreover, there exists a nonincreasing function 

f :[0, 00) > R+, with f (lril) = O (e-lr^) for some p > 0, such that for y > 0 

and c sufficiently large 

(B5) PIXc(t) 2»z - y/z, Xc(t +uAz) » z - y/zi S v(z — v/2) f (lull) 

uniformly in ¢ and t + uA, belonging to [De]s. 


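THEOREM 2.8. Assume (C) and (B1)-(B5). Suppose that (2.14) and (2.16)
hold for some $\kappa > 0$ and $\ell_c \to 0$ and that

(2.18) $\sup_{t \in [D_c]_\delta} [\bar{b}_c^2(t) - \underline{b}_c^2(t)] = o(1)$,
where $\bar{b}_c(t) = \sup_{u \in I_{t,\ell_c}} b_c(u)$, $\underline{b}_c(t) = \inf_{u \in I_{t,\ell_c}} b_c(u)$.

Then $P\{X_c(t) > b_c(t) \text{ for some } t \in D_c\} \sim \int_{D_c} \psi(b_c(t))\,\Delta_{b_c(t)}^{-d} H(t)\,dt$ as $c \to \infty$.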


The next corollary specializes Theorem 2.8 to the case in which $b_c(t) = c\,b(t)$
for some positive function $b$ possessing continuous second derivatives on $[D]_\delta$,
where $D$ is a compact Jordan measurable set. Let $b_D = \inf_{t \in D} b(t)$ and assume
that $M = \{t \in D : b(t) = b_D\}$ is a $q$-dimensional manifold (with boundary) such
that $\nu_q(M \cap \partial D) = 0$, in which $\nu_q$ denotes the $q$-dimensional volume element of
the manifold. Let $TM^\perp(t)$ denote the normal space of the manifold $M$ at $t$. Letting
$\{e_1(t), \ldots, e_{d-q}(t)\}$ be an orthonormal basis of $TM^\perp(t)$, define the $d \times (d - q)$
matrix $A(t) = (e_1(t) \cdots e_{d-q}(t))$ and assume that $\nabla_\perp^2 b(t) := A'(t)\nabla^2 b(t)A(t)$ is a
positive definite $(d - q) \times (d - q)$ matrix for all $t \in M$.


COROLLARY 2.9. Suppose (C) and (B1)-(B5) are satisfied with $\alpha < 2$ and
$D_c = D$, a compact Jordan measurable set. Then as $c \to \infty$,
$P\{X_c(t) > c\,b(t) \text{ for some } t \in D\}$
(2.19) 
~ Ar (cbp)b A" A-d Qn Jc] pd | IV2 DATH Ov (dt). 
M 


EXAMPLE 2.10. Let X(f1,12) = (t2 — t) IW (£2) — W(t1)] and Xe = X 
as in Example 2.2, where W(-) is Brownian motion, and let b,(t), t2) = [c? + 
2log(f? — n) 5^ for some 6 > 1. Let D = ((t,55):0 tj < h <a,a, S 


MAXIMA OF RANDOM FIELDS 89 


$t_2 - t_1 \le a_2\}$ where $0 < a_1 < a_2 \le a$. Arguments similar to those used to prove
Theorem 2.1 in Section 5 can be used to show that (B1)-(B5) hold uniformly in
$t \in [D]_\delta$ and $\underline{b}_c/2 \le z \le \bar{b}_c$. Therefore by Lemma 2.3 and Theorem 2.8,


P5 t) ^W (t2) — W(t,)|> [c? + 2log(t, — t) f] 


for some 0 < t1 < h <a, a] < h — t <a} 


^ v ()Qc^y | 24 PF logt) (4. — t) dti dh 
D 


s D d s ?*? dt ds 


EINE (ay Et] 





2.4. Discussion and related literature. Our formulation of "asymptotically
Gaussian" random fields bears some resemblance to Aldous's [4] Poisson clumping
heuristic, which involves i.i.d. clumps of high-level excursions of a stochastic
process $X(t)$, with the stochastic structure of the clump determined by the
conditional limiting process [like that in (A2)] of normalized local increments.
Whereas the Poisson clumping heuristic only suggests an asymptotic approximation
to $P\{\sup_{t \in D} X_c(t) \le c\}$ of the form $e^{-p_c}$ with $p_c \to 0$, our approach actually
gives a rigorous derivation of an asymptotic formula for $p_c$. Instead of a single
stochastic process $X(t)$, our formulation involves a family of random fields
$X_c(t)$ with $EX_c(t) = 0$ and $\mathrm{Var}(X_c(t)) = 1$. It consists of two basic components:
(i) a normal approximation to the probability of $X_c(t)$ exceeding some high level
(depending on $c$) in (A1) or (B1), and (ii) the weak convergence of the
finite-dimensional distributions of the local increments conditioned on $X_c(t) = c - y/c$
in (A2) [or (B2)]. The covariance structure of the local increments given by
condition (C) and the closely related mean and covariance functions (2.1) of the limiting
Gaussian random field in (A2) [or (B2)] provide the key ingredients in the
asymptotic formulas in Corollaries 2.6, 2.7 and Theorem 2.8. Theorem 2.1 and its proof
show that these asymptotic formulas are the same as in the special case $X_c = X$,
a zero-mean Gaussian random field satisfying condition (C). These asymptotic
formulas are derived by adding up corresponding results for small cubes in (2.11),
making use of (2.12) to justify the additivity.

Conditions of the type (A2) were introduced by Berman ([6], Theorem 5.1) for
asymptotic approximations (as $c \to \infty$) to the probability $P\{\sup_{0 \le t \le T} X(t) > c\}$
of a stationary process $X(t)$ (with $d = 1$) such that $X(0)$ belongs to the domain of
attraction of an extreme value distribution; see [6], Theorem 14.1 and [3],
Theorem 1. We consider here general $d$, extend $X(t)$ to $X_c(t)$ and remove the stationarity
assumption, but restrict the limiting distribution in (A2) to be Gaussian and the
marginal probabilities $P\{X_c(t) > c - y/c\}$ to be asymptotically normal. It will be
shown in Sections 3 and 4 that extending a single stationary process $X(t)$ to a
family of possibly nonstationary random fields $X_c(t)$ and generalizing the threshold $c$
to a moving boundary $b_c(t)$ greatly broaden the scope of applications. Some of the
difficulties in proving these extensions to the nonstationary setting are explained
in Remark 5.1.

Corollary 2.9 and its proof in Section 5 reveal similarities and differences
between our approach and the tube formulas of Hotelling [21] and Weyl [36] whose
applications to the maxima of Gaussian random fields are reviewed in Section 6
of [1]. As in [8], the use of the tubular neighborhood $U_\delta$ of the extremal
manifold $M$ in the proof of Corollary 2.9 is related to Laplace's method for asymptotic
evaluation of the integral $\int_{D_c} \psi(b_c(t))\,\Delta_{b_c(t)}^{-d} H(t)\,dt$, in which the integrand can
be regarded as an "asymptotic density" of crossing the boundary $b_c$ by $X_c$ at $t$
(see the paragraph following Theorem 2.5). Differential geometric considerations
arise naturally in applying Laplace's method to integrate the asymptotic boundary
crossing density, and clearly also in the Euler characteristic and tube formulas of
excursion sets in [1].


3. Sums of i.i.d. random variables with multidimensional indices and
associated Kolmogorov-Erdős-Feller test. Let $Y_k$, $k \in \mathbf{Z}_+^d$, be i.i.d. random
variables with

(3.1) $EY_k = 0, \quad EY_k^2 = 1, \quad E|Y_k|^3 < \infty.$

Let $S_n = \sum_{k \le n} Y_k$, where $k \le n$ denotes that $k_i \le n_i$ for $1 \le i \le d$, as
in Section 2. Let $|n| = \prod_{i=1}^d n_i$, $\log n = (\log n_1, \ldots, \log n_d)$ and
$\exp(t) = (\exp(t_1), \ldots, \exp(t_d))$. Define $X(\log n) = |n|^{-1/2} S_n$ and extend the domain of $X$
to $[0, \infty)^d$ by defining $X(t) = X(\log n)$ when $\log n_i \le t_i < \log(n_i + 1)$ for all $i$.
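
As an illustration of these definitions, the following minimal Python sketch (not part of the paper) builds the partial sums $S_n$ for $d = 2$ by cumulative summation and normalizes them to $X(\log n) = |n|^{-1/2} S_n$. It also evaluates the exact correlation $|\min(n, m)| / (|n||m|)^{1/2}$ of two normalized sums for unit-variance i.i.d. $Y_k$, which coincides with $\exp(-\sum_i |u_i|/2)$ for $u = \log m - \log n$, the quantity that enters (3.2). The function names `partial_sums`, `normalized_field`, `exact_correlation` and `exp_formula` are hypothetical and chosen only for this example.

```python
import math
from itertools import accumulate


def partial_sums(y):
    """Return S with S[i][j] = sum of y[a][b] over a <= i, b <= j (0-based, d = 2)."""
    # Cumulate along each row, then add the cumulated row above.
    rows = [list(accumulate(row)) for row in y]
    for i in range(1, len(rows)):
        rows[i] = [above + cur for above, cur in zip(rows[i - 1], rows[i])]
    return rows


def normalized_field(s):
    """Return X with X[i][j] = S_n / |n|^(1/2) for the index n = (i + 1, j + 1)."""
    return [[s[i][j] / math.sqrt((i + 1) * (j + 1)) for j in range(len(s[i]))]
            for i in range(len(s))]


def exact_correlation(n, m):
    """Correlation of |n|^(-1/2) S_n and |m|^(-1/2) S_m for unit-variance i.i.d. Y_k."""
    # The two sums share exactly the terms Y_k with k <= min(n, m) coordinatewise.
    common = math.prod(min(a, b) for a, b in zip(n, m))
    return common / math.sqrt(math.prod(n) * math.prod(m))


def exp_formula(n, m):
    """exp(-sum_i |u_i| / 2) with u = log m - log n."""
    return math.exp(-sum(abs(math.log(b) - math.log(a)) for a, b in zip(n, m)) / 2)
```

For instance, `exact_correlation((3, 8), (5, 4))` and `exp_formula((3, 8), (5, 4))` agree up to floating-point rounding, even though neither index dominates the other.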


Let $X_c = X$, $\rho_c = \rho$ and let $D_c$ be a Jordan measurable subset of $\{t : \sum_i t_i > c^3\}$.
If $t = \log n$ and $t + u = \log m$ for some $m, n \in \mathbf{Z}_+^d$, then


(3.2) $1 - \rho(t, t + u) = 1 - \mathrm{Cov}(|n|^{-1/2} S_n, |m|^{-1/2} S_m) = 1 - \exp\bigl(-\sum_i |u_i|/2\bigr) \sim \sum_i |u_i|/2$

as $u \to 0$. From (3.2), it follows that (C) holds with $\alpha = 1$, $L(x) = 1$,
$r_t(u) = \sum_i |u_i|/2$, and therefore $\Delta_c = (2c^2)^{-1}$ by (2.5). Moreover, by the Berry-Esseen
theorem (cf. [15], Theorem 16.4.1), for $\log n \le t < \log(n + 1)$,

(3.3) $P\{X(t) > c - y/c\} - \int_{c - y/c}^{\infty} (2\pi)^{-1/2} e^{-z^2/2}\,dz = O(|n|^{-1/2})$

uniformly over $c$ and $y$. Since $\log|n|/c^2 \to \infty$ uniformly over $t \in [D_c]_\delta$, it
follows from (3.3) that (A1) holds. Moreover, as will be shown in Lemma 3.6,
(A3) and (A4) are satisfied uniformly over $[D_c]_\delta$. If we assume in addition that
for some $\varepsilon > 0$ and $\kappa > 0$,

(3.4) $\sup_{t, u \in [D_c]_\delta} \|u - t\| = O(c^\kappa)$ and $[D_c]_\delta \subset G_\varepsilon := \{t : t_i / \sum_j t_j > \varepsilon \text{ for all } i\}$,

then Lemmas 3.5 and 3.7 show that (A2) and (A5) also hold. Therefore $X(t)$,
$t \in D_c$, is asymptotically Gaussian, and we shall apply Lemma 2.3 and
Corollary 2.7 at the end of this section to prove the following two theorems.


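THEOREM 3.1. (i) Assume that for some positive $\ell_c \to 0$ with $\Delta_c = o(\ell_c)$,

(3.5) $\ell_c^d\, \#\{t \in (\ell_c \mathbf{Z})^d : I_{t,\ell_c} \cap D_c \ne \emptyset\} = O(v(D_c))$

as $c \to \infty$. Then $P\{\sup_{t \in D_c} X(t) > c\} = O(v(D_c)\, c^{2d} \psi(c))$.
(ii) If (3.4) also holds, then

(3.6) $P\{\sup_{t \in D_c} X(t) > c\} \sim 4^{-d}\, v(D_c)\, \Delta_c^{-d}\, \psi(c).$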


Theorem 3.1(i) enables us to extend the Kolmogorov-Erdős-Feller test (1.3) to
the case of multidimensional time. Let $\beta : \mathbf{Z}_+^d \to (0, \infty)$ be nondecreasing in the
sense that $\beta(m) \le \beta(n)$ for all $m \le n$. We say that $\beta$ is an upper (lower) class
function if

(3.7) $\sup\{|n| : |n|^{-1/2} S_n > \beta(n)\} < (=)\ \infty$ a.s.

For $\varepsilon \ge 0$, let $F_\varepsilon = \{n \in \mathbf{Z}_+^d : \log n_i / \log|n| > \varepsilon \text{ for all } i\}$; in particular $F_0 = \mathbf{Z}_+^d$.
Define

(3.8) $J_\varepsilon = \sum_{n \in F_\varepsilon} |n|^{-1} \beta^{2d-1}(n)\, e^{-\beta^2(n)/2}.$


THEOREM 3.2. If $J_0 < \infty$, then $\beta$ is an upper class function. Conversely, if
$J_\varepsilon = \infty$ for some $\varepsilon > 0$, then $\beta$ is a lower class function.


EXAMPLE 3.3. In the case $d = 1$, since $x e^{-x^2/2}$ is decreasing in $x \ge 1$, it
follows that $\int^\infty t^{-3/2} b(t)\, e^{-b^2(t)/(2t)}\,dt < \infty$ iff $\sum_n n^{-1} \beta(n)\, e^{-\beta^2(n)/2} < \infty$, where
$\beta(n) = b(n)/\sqrt{n}$ is nondecreasing. Therefore the integral test (1.3) is equivalent to
Theorem 3.2, noting that $J_\varepsilon = J_0$ for all $0 < \varepsilon < 1$ in the case $d = 1$. Next consider
$d = 2$ and let $\beta$ be a positive function on $\mathbf{Z}_+^2$ such that


Y: niae o, 


n=l no>l 


— T con 
y [n BOr) + Inl! 82 (15,22)]e A < oo. 
n 


Then $\sup\{|n| : |n|^{-1/2} S_n > \beta(n),\ n_1 \ge 2\} < \infty$ a.s. and
$\sup\{n_2 : n_2^{-1/2} S_n > \beta(n),\ n_1 = 1\} < \infty$ a.s., by the first part of Theorem 3.2.
On the other hand, $J_0 = \infty$ although $J_\varepsilon < \infty$ for every $\varepsilon > 0$. This shows the
importance of using $J_\varepsilon$ instead of $J_0$ for the lower class result in Theorem 3.2.


Let $\beta(n) = \{(2 + \delta)\, d \log\log|n|\}^{1/2}$ for $|n| > e$ and $\delta \ge 0$. Then by the inequality
$d^{-1} \sum_{i=1}^d \log n_i \ge (\prod_{i=1}^d \log n_i)^{1/d}$ between arithmetic and geometric means,
there exist $C, C' > 0$ such that


Jo X C Y (loglog In)? {Ini dog np *?/2] 


nezi 


d 
sC] De n; (logan CH), 


i=] ni cz. 


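so $J_0 < \infty$ if $\delta > 0$. Take $0 < \varepsilon < d^{-1}$ and note that the number of $k$'s such that
$\sum_i k_i = m$ and $e^k \in F_\varepsilon$ (so that $k_i > \varepsilon m$) is $(B + o(1))\, m^{d-1}$ for some $B > 0$.
Since $\prod_{i=1}^d \sum_{e^{k_i} \le n_i < e^{k_i + 1}} n_i^{-1} \sim 1$ as $\min_i k_i \to \infty$, it follows that if $\delta = 0$, then
there exists $B' > 0$ such that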


Je= 9; Inl "og inl)? > BY Y; m^ =o. 


nc F,,n-ng m-mg 


Hence by Theorem 3.2, $\beta(n)$ belongs to the upper class if $\delta > 0$ and to the lower
class if $\delta = 0$, yielding the following.


COROLLARY 3.4. $\limsup_{|n| \to \infty} S_n / (2d\,|n| \log\log|n|)^{1/2} = 1$ a.s.


In the case $d = 2$, Zimmerman [38] proved an analogue of Corollary 3.4 for
the Brownian sheet, which is a zero-mean Gaussian random field with independent
increments and variance function $|t|$, like that of $S_n$. His result was
subsequently strengthened by Orey and Pruitt ([26], Theorem 2.2) who proved that for
the $d$-dimensional Brownian sheet $W(t)$, $P\{W(t)/|t|^{1/2} \le f(|t|) \text{ for all large } |t|\} = 1$
(or 0) if


(3.9) I gs (log ge (log log £)4-125- f^ 6/2 d£ < (or — )oo. 


Actually their result considers $t \to 0$ rather than $|t| \to \infty$. However, as
$|t|\, W(1/t_1, \ldots, 1/t_d)$ is also a Brownian sheet, one can extend their integral test
to the preceding statement. Because continuous Gaussian processes (instead of
discrete-time sample sums) are involved, the tail distribution of the maximum over
a domain $D_c$ does not require condition (3.4); see (2.6) in this connection. Hence
unlike (3.8), the integral test (3.9) does not involve $F_\varepsilon$. Instead of the series (3.8),
we can rewrite it as an integral when $F_\varepsilon$ is not involved, expressing the
convergence criterion in Theorem 3.2 (taking $\varepsilon = 0$) as the integral test


(3.10) $\int_1^\infty \cdots \int_1^\infty (t_1 \cdots t_d)^{-1} \beta^{2d-1}(t)\, e^{-\beta^2(t)/2}\,dt_1 \cdots dt_d < (\text{or} =)\ \infty.$


Note that (3.10) considers more general functions $\beta(t)$ than those of the form
$f(|t|)$ considered by Orey and Pruitt [26]. In the case $\beta(t) = f(|t|)$, assuming
without loss of generality that $c_0 \le f(\xi)/(\log\log \xi)^{1/2} \le c_1$ for some $0 < c_0 < c_1$
(see the proof of Theorem 3.2), the change of variables $\xi = t_1 \cdots t_d$ in (3.10) shows
that (3.9) and (3.10) are indeed equivalent.

Strong approximations of $S_n$ have been developed by Rio [32] who has shown
that if $Y_k$, $k \in \mathbf{Z}_+^d$, are i.i.d. with $EY_k = 0$, $EY_k^2 = 1$ and $E|Y_k|^r < \infty$ for some
$r > 2$, then redefining the random variables on a new probability space yields

(3.11) $\sup_{0 \le n \le v\mathbf{1}} |S_n - W(n)| = O\bigl(v^{(d-1)/2} (\log v)^{1/2} + v^{d/r}\bigr)$ a.s.

Note that (3.11) bounds the approximation error $S_n - W(n)$ by

$(\max_{1 \le i \le d} n_i)^{(d-1)/2} (\log \max_{1 \le i \le d} n_i)^{1/2} + (\max_{1 \le i \le d} n_i)^{d/r}$

instead of by some sufficiently small power of $|n| = \prod_i n_i$. Therefore Rio's
strong approximation (3.11) cannot be combined with the Orey-Pruitt integral test
(3.9) for $W(t)$ to yield a corresponding integral test for $S_n$. Example 3.3 shows
that the integral test (3.9) for $W(t)$ actually does not hold for $S_n$, which requires a
more subtle criterion for a lower class of functions.

The proof of Theorem 3.1 uses the following three lemmas which show that 
(A2)-(A5) hold under (3.4). 


LEMMA 3.5. Assume (3.4). Let $u, v > 0$. Then as $c \to \infty$,

(3.12) $E\{c[X(t + u\Delta_c) - X(t)] \mid X(t) = c - y/c\} \to -\sum_{i=1}^d u_i/4$,

(3.13) $\mathrm{Cov}\{c[X(t + u\Delta_c) - X(t)],\, c[X(t + v\Delta_c) - X(t)] \mid X(t) = c - y/c\} \to \sum_{i=1}^d \min(u_i, v_i)/2$,

uniformly over bounded values of $y$ and $t \in [D_c]_\delta$. Hence (A2) holds.


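PROOF. For $\exp(t) \in \mathbf{Z}_+^d$ and $\exp(t + u\Delta_c) \in \mathbf{Z}_+^d$, define

(3.14) $Z_t(u) = \sum_{k \in \{k : k \le \exp(t + u\Delta_c)\} \setminus \{k : k \le \exp(t)\}} Y_k = X(t + u\Delta_c) \exp\bigl(\sum_i (t_i + u_i\Delta_c)/2\bigr) - X(t) \exp\bigl(\sum_i t_i/2\bigr).$

Conditioned on $X(t) = c - y/c$,

(3.15) $c(X(t + u\Delta_c) - X(t)) = cZ_t(u) \exp\bigl(-\sum_i (t_i + u_i\Delta_c)/2\bigr) - c(c - y/c)\bigl[1 - \exp\bigl(-\Delta_c \sum_i u_i/2\bigr)\bigr]$
$\qquad = cZ_t(u) \exp\bigl(-\sum_i t_i/2 - \Delta_c \sum_i u_i/2\bigr) - \sum_i u_i/4 + o(1).$

Since $Z_t(u)$ is independent of $X(t)$, (3.12) follows from (3.4) and (3.15). To see
this, suppose $\log n_i \le t_i + u_i\Delta_c < \log(n_i + 1)$ and $\log m_i \le t_i < \log(m_i + 1)$ for
$1 \le i \le d$. By (3.4), $t_i > \varepsilon \sum_j t_j > \varepsilon c^3/2$ for $t \in [D_c]_\delta$ and all large $c$. This implies
that $\log(m_i + 1) - \log m_i = \log(1 + m_i^{-1}) \le \log(1 + e^{-\varepsilon c^3/2}) = o(\Delta_c)$ and that
$\log(n_i + 1) - \log n_i = o(\Delta_c)$, so by (3.14) and (3.15),

$E\{c[X(t + u\Delta_c) - X(t)] \mid X(t) = c - y/c\} = -\sum_{i=1}^d (\log n_i - \log m_i)/(4\Delta_c) + o(1) \to -\sum_{i=1}^d u_i/4.$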

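Similarly, for $u, v > 0$,

(3.16) $\mathrm{Cov}(Z_t(u), Z_t(v)) \sim \prod_{i=1}^d \exp(t_i + \min(u_i, v_i)\Delta_c) - \prod_{i=1}^d \exp(t_i) \sim \bigl[\sum_{i=1}^d \min(u_i, v_i)\bigr] \Delta_c \exp\bigl(\sum_{i=1}^d t_i\bigr)$

and (3.13) follows from (3.4), (3.15), (3.16) since $Z_t(v)$ is also independent of
$X(t)$. □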


LEMMA 3.6. (i) $P\{\max_{k \le n} S_k \ge \lambda\} \le 2^d P\{S_n \ge \lambda - d(2|n|)^{1/2}\}$.

(ii) There exists a positive function $h$, with $\lim_{y \to \infty} h(y) = 0$, satisfying (A3).

(iii) There exist nonincreasing functions $N_a$ on $\mathbf{R}_+$ and positive constants $\gamma_a$
such that $\gamma_a \to 0$, $N_a(\gamma_a) + \int_1^\infty y^s N_a(\gamma_a + y)\,dy = o(a^d)$ for all $s > 0$, as $a \to 0$,
and (A4) holds.


PROOF. For (i), see [17], Lemma 2.3. To prove (ii), let u > 0, w > 0, y > 0. 
By (3.15), 


(3.137) — c[X(t-- uA) — X(t)] < (1 + 0(1)) Jeztwexo(- Yun 


Since |(k:k x exp(t+uA,)}| — |{k:k x exp(0)] ^ (37; u;/2c?) exp (Y; ti), it fol- 
lows by the independence of Z,(u) and X (t) and the Berry—Esseen theorem that 
for large c, 


P{c[X(t+uA-) - X(01 > y - vyIX(0 =c — y'/c} 


(3.18) < p zo > [o/ — »)/2e] ee (X: 2) 


x v(B(y' — y)) + 0(cem(- Yu: 


for some B > 0, uniformly over y < y! < oc. In view of (A1), we can choose 
Ee — 0 such that P{X (t) > c — y'/c] = (14+ O (&2))y (c — y'/c) uniformly over 
y € y € oc. Let yj =y + j&, j =0,1,.... Then by (3.18), 
P{X(t+uA,) >c—y/c,c—~w< X(t) <c- y/c} 
yj4 
< x P(X (t-- uAc) » c — y/clX(6 — c — y'/c) 
0<j<(we—y)/&, * 7 ! 


x P(X(0 ec — dy'/c) 


SUE mee mal) 


Osj s(oc—y)/&c 
(3.19) x [P{X(t) > c — yj41/c] - PIX (t) > c — yj/c]] 


«(1-o0) Ð veo- 


0xjzx(oc—y)/&c 
+0 ( ex — 5 usa) lewo 


< vola + o(1)) 


x L e y (BC — y))dy' + 0 (cexp| oe — Xal) 


i 




Since jt > c), cexp(oc — Y 1/2) = o(1). Moreover, f; e" (By! — 
y)) dy’ — 0 as y — co. From (3.17) and (3.18), it follows that for large c, 


P(IX (t--uA;) — X] > w} < P [z > (@/2)exp (X i) | 


(3.20) € y(Bco) +0 ( ev(- «wa 


= o(w(c)) 
if we choose w > B^. Hence (ii) follows from (3.19) and (3.20). | 
To prove (id) note that (k:k < exp(t + a1A,)}\ {k:k < exp(t)) = 
Urctt,....d},74@ 4s, Where Aj = (k:exp(fi) < kj x exp(f; + aA,) fori € J and 
ki < exp(ti) for i € J}. By (1), | 


P| sup za) >z] 


0<u<al 


3.21) < sup > Ya >22} 


P| 
Ictt,...d},J4#@ | VK€ÀJ m<k med, 


« 27 P| y. Yu > 224 -aaiy 
JC(1,..,d,Jx9 med, 


Since |A;| ~ (aA) l exp; tj), it follows by (3.17), (3.21) and the Berry- 
Esseen theorem [using the same steps as in (3.18)] that for large c, 


P| sup [X WA.) - X(] » y'KO=c-y'/c| 


0<u<al 


(3.22) <P | sup Zu) > (y'/2c) ee (X: fi n) 


0<u<al 


< Ad y (By Jal? —2'/*d)+ 0 (^ v (- Sot n) 
i 
for some B’ > 0, uniformly over y < y < cc, and therefore 


Plc-ozxQ «c-—y/c, sup X(t+uA,) >c] 


0<u<al 


(3.23) < yr (c) E Í "e! vy (By Jal? — 2a) dy! 
y 


rofe p(o -Zr n))| 




Since Y; t; > c, Olt exp(oc — Y; t;/2)) = o(1). By (3.17), (3.21), (3.22) 
and (1), i 


| sup XX &--uA|) - X (0] » o] 


0<u<al 


(3.24) < P| sup Zt(u) > (w/2) exp (X ti n | 
1 i 


0<u<a 


< 44 y (B'cw/a‘!* — 27d) + O G exp (- 5 n) 


for all large c. Let yg = a! and take w > a'/?/B' so that y (B'co/al/? — 
d2'/*) = o(y (c)). Recalling that Y; t; > c^, it follows from (3.23) and (3.24) that 
for all large c and yg < y x c, 


P] sup X(t+uA,) >c, X(t) <c~y/c} 


0<u<al 
oo / 
«y() |. se" (By Ja? — 24a) dy! = yo) Nay), 
y 
with Na(ya) + JYO y! Na(ya + y) dy = o(a”) for all s > Oand p — 0. O 


LEMMA 3.7. Assume (3.4). For $y > 0$, there exist positive constants $B_1$, $B_2$
and $\eta$ such that

(3.25) $P\{X(t) > c - y/c,\; X(t + u\Delta_c) > c - y/c\} \le B_1 \exp(-B_2 \|u\|^\eta)\, \psi(c - y/c)$

uniformly over $t, t + u\Delta_c \in [D_c]_\delta$. Hence (A5) holds.


PROOF. For $\exp(t) \in \mathbf{Z}_+^d$ and $\exp(t + u\Delta_c) \in \mathbf{Z}_+^d$,

(3.26) $P\{X(t) > c - y/c,\; X(t + u\Delta_c) > c - y/c\} \le P\{X(t) + X(t + u\Delta_c) > 2(c - y/c)\}$,

in which $X(t) + X(t + u\Delta_c)$ is a sum of $\prod_{i=1}^d \max(e^{t_i}, e^{t_i + u_i\Delta_c})$ random
variables, each of the form
$(\exp(-\sum_i t_i/2)\, 1_{\{k \le \exp(t)\}} + \exp(-\sum_i (t_i + u_i\Delta_c)/2)\, 1_{\{k \le \exp(t + u\Delta_c)\}})\, Y_k$.
Since $\sum_i t_i > c^3$, $X(t) + X(t + u\Delta_c)$ is a sum of at least
$\exp(c^3)$ i.i.d. random variables. Using this and $\mathrm{Var}(X(t) + X(t + u\Delta_c)) = 2(1 + \rho(t, t + u\Delta_c))$,
we then obtain by the Berry-Esseen theorem that


P(X (t) + X (t-- uA, > 2(c — y/c)) 


2(c — y/cy zio 3 
suh (aaay) )*0€wcem 


(3.27) 


By (3.2), there exists $\zeta > 0$ such that $1 - \rho(t, t + u\Delta_c) \ge \Delta_c \sum_i |u_i|/4$ if
$\Delta_c \sum_i |u_i| \le \zeta$. Moreover, $1 - \rho(t, t + v) = 1 - \exp(-\sum_i |v_i|/2) \ge \xi := 1 - e^{-\zeta/2}$
if $\sum_i |v_i| \ge \zeta$. Since


(e m)" 
sw - y/o( PO nee) ps 
E 2 


—(c — y/c)* (1 — p(t, t + 21 


à exp| 2(1 + p(t,t+uA,)) 


2 
—(c— 1— p(t,t A 
«ve -y/c)exn| (c — y/c)' - p(t,tt+u 2] 
it follows from (3.26) and (3.27) that for all large c, 
P{X(t)>c—y/c, Xt -uA) >c—y/c} 


(3.28) | € v(c— y/o) 


-c 
x |ew(- 2 iif52 tz wie) - € ^ 5 NA, y; meo | 


I 
Since $t$ and $t + u\Delta_c$ belong to $[D_c]_\delta$ and since $\sup_{v, t \in [D_c]_\delta} \|v - t\| = O(c^\kappa)$
by (3.4), it follows that $\sum_j |u_j| \le d\|u\| = O(c^{\kappa + 2})$. Hence (3.25) with
$\eta < \min\{1, 2/(\kappa + 2)\}$ follows from (3.28). □


PROOF OF THEOREM 3.1. We have already shown that conditions (C), (A1),
(A3) and (A4) are satisfied and that (A2) and (A5) also hold under (3.4). Let
$\Pi_t = \{t + k\Delta_c \in I_{t,\ell_c} : k \in \mathbf{Z}^d\}$. Note that $\ell_c/\Delta_c \to \infty$ and that

(3.29) $P\{\sup_{u \in I_{t,\ell_c}} X(u) > c\} \le \sum_{u \in \Pi_t} \bigl[P\{X(u) > c - 1/c\} + P\{X(u) \le c - 1/c,\; \sup_{0 \le v \le \mathbf{1}} X(u + v\Delta_c) > c\}\bigr] = O((\ell_c/\Delta_c)^d \psi(c))$,

by (3.3) and (A4). By adding up (3.29) over $\{t \in (\ell_c \mathbf{Z})^d : I_{t,\ell_c} \cap D_c \ne \emptyset\}$, it
follows from (3.5) that $P\{\sup_{t \in D_c} X(t) > c\} = O(v(D_c)\, \Delta_c^{-d} \psi(c))$. By (3.2) and
Lemma 2.3, $H(t) = 4^{-d}$. If (3.4) also holds, then (A1)-(A5) all hold and
Corollary 2.7 can be applied to give (3.6). □


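LEMMA 3.8. Let $\beta : \mathbf{Z}_+^d \to (0, \infty)$ be nondecreasing and such that
$\beta(n) \le \{3d \log\log|n|\}^{1/2}$. Define $J_\varepsilon$ by (3.8) and let $c(t) = \beta(\lfloor \exp(t) \rfloor)$ for $t \in \mathbf{R}_+^d$,
$c(0) = \beta(\mathbf{1})$.

(i) If $J_0 < \infty$, then $\sum_{k \ge 0} c^{2d-1}(k)\, e^{-c^2(k)/2} < \infty$.

(ii) Let $w_1 = 2$ and $w_{j+1} = w_j + \log w_j$ for $j \ge 1$. Then $w_j \sim j \log j$ as
$j \to \infty$. For $k \in \mathbf{Z}_+^d$, define the rectangle $I^{(k)} = \prod_{i=1}^d [w_{k_i}, w_{k_i + 1})$ and let
$w_k = (w_{k_1}, \ldots, w_{k_d})$. Assume furthermore that $\beta(n) \ge (d \log\log|n|)^{1/2}$ and $J_{\varepsilon'} = \infty$ for
some $\varepsilon' > 0$. Then for every $0 < \varepsilon < \varepsilon'$,

(3.30) $\sum_{k \ge 3:\, I^{(k)} \subset G_\varepsilon} v(I^{(k)})\, c^{2d-1}(w_k)\, e^{-c^2(w_k)/2} = \infty$,

where $G_\varepsilon$ is given in (3.4) and $v(\cdot)$ denotes volume of the rectangle.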
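
The blocking sequence in part (ii) is easy to tabulate. The following minimal Python sketch (an illustration only, with the hypothetical helper name `w_sequence`) iterates $w_{j+1} = w_j + \log w_j$ from $w_1 = 2$, so that the ratio $w_j/(j \log j)$ and the block widths $w_{j+1} - w_j = \log w_j$ can be inspected numerically. Convergence of the ratio to 1 is slow, so moderate values of $j$ only show it in the right vicinity.

```python
import math


def w_sequence(count):
    """Return [w_1, ..., w_count] with w_1 = 2 and w_{j+1} = w_j + log(w_j)."""
    w = [2.0]
    while len(w) < count:
        w.append(w[-1] + math.log(w[-1]))
    return w


def ratio_to_j_log_j(w, j):
    """Return w_j / (j log j) for j >= 2, using 1-based indexing into the list w."""
    return w[j - 1] / (j * math.log(j))
```

A call such as `ratio_to_j_log_j(w_sequence(1000), 1000)` gives a quick numerical feel for the asymptotic relation $w_j \sim j \log j$.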


PROOF. Note that $x^{2d-1} e^{-x^2/2}$ is decreasing for $x \ge x_0$. Since $\beta$ is
nondecreasing and there are only finitely many $n$'s with $\beta(n) < x_0$ in parts (i) and (ii)
of the lemma, we can assume without loss of generality that $\beta^{2d-1}(n)\, e^{-\beta^2(n)/2}$ is
decreasing in $n$. Therefore


k20 ln:e kit 


ki KA ceg“ 
> >a I 2e yd c?d-1 (le ^ 99/2. 
k>1 


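noting that $|n|^{-1} \ge \prod_{i=1}^d e^{-(k_i + 1)}$ if $e^{k_i} \le n_i < e^{k_i + 1}$ for all $i$ and that
$\{n : e^{k_i} \le n_i < e^{k_i + 1} \text{ for all } i\}$ has at least $\prod_{i=1}^d (e^{k_i + 1} - e^{k_i} - 1)$ elements. A similar argument
also shows that for any $\{i_1, \ldots, i_j\} \subset \{1, \ldots, d\}$,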


A. pape 
cd 1 (k)e C (k)/2 < co. 
kk =0,k;>1 for iG {i1,...,0;} 


To prove (ii), first consider the case d = 1 for which J, = Jo. Here (3.30) fol- 
lows from Jo = co, mE 


oo 
(3.31) Y^ n^! B(n)e PM? < > c(wp)e* (972 | n l 
nze k=2 exp(wy) xn <exp(wx41) 


and 2 exp(wy) <n<exp(weyt) n S Wk — We t+1= vl) +1. 




We next consider the case d > 1. Since there are finitely many n’s belonging 
to Z4 such that logn € Ge \ (Uk: ming>3,1@co, 19), there exists C > 0 such 
that 


S M _ 22 
Jes x in] | gd lne B*(n) /2 


E 
logneG, 


a | E niae 


k>3: J(B cG, Uognel 


which we can bound as in (3.31) to obtain (3.30) if Jy = oo, noting that 


d d 
>> nz M > ni) < | [ (wai — we +1), 
psi 


logne7 (9 i=] vexp(wi;)zni <exp(wy, +1) 


and that [[4_1(wx,41 — wey + 1) ^ v(109) as miniz;zg ki — œ. O 


PROOF OF THEOREM 3.2. Suppose the theorem holds under the additional
assumption $\beta(n) \le \{3d \log\log|n|\}^{1/2}$. To show that it also holds without
this additional assumption, define $\tilde\beta(n) = \min\{\beta(n), (3d \log\log|n|)^{1/2}\}$ for
an arbitrary function $\beta : \mathbf{Z}_+^d \to (0, \infty)$. If $J_0(\beta) < \infty$, then
$J_0(\tilde\beta) \le J_0(\beta) + J_0(\{3d \log\log|n|\}^{1/2}) < \infty$ and hence

$\sup\{|n| : S_n/|n|^{1/2} > \beta(n)\} \le \sup\{|n| : S_n/|n|^{1/2} > \tilde\beta(n)\} < \infty$ a.s.

If $J_\varepsilon(\beta) = \infty$, then $J_\varepsilon(\tilde\beta) = \infty$, so $\sup\{|n| : S_n/|n|^{1/2} > \tilde\beta(n)\} = \infty$ a.s. Since
$\sup\{|n| : S_n/|n|^{1/2} > (3d \log\log|n|)^{1/2}\} < \infty$ a.s., it then follows that
$\sup\{|n| : S_n/|n|^{1/2} > \beta(n)\} = \infty$ a.s.

Define $c(t)$ as in Lemma 3.8. In view of the preceding paragraph, we shall
assume that $c(t) \le (3d \log(\sum_j t_j))^{1/2}$ [and hence $\sum_j t_j > c^3(t)$ for large $t$] and
there is no loss of generality. We can apply Theorem 3.1 to $D_c = I_{k,1}$ [noting that
(3.5) clearly holds for such cubes with unit width] and combine the result with
Lemma 3.8(i) to conclude that if $J_0 < \infty$, then

$\sum_{k \ge 0} P\{\sup_{t \in I_{k,1}} X(t) > c(k)\} = O\bigl(\sum_{k \ge 0} c^{2d-1}(k)\, e^{-c^2(k)/2}\bigr) < \infty$,

and therefore $\sum_{k \ge 0} P\{|n|^{-1/2} S_n > \beta(n) \text{ for some } n \text{ with } \log n \in I_{k,1}\} < \infty$.
Hence by the Borel-Cantelli lemma, $\beta$ is an upper class function if $J_0 < \infty$.
Suppose $J_\lambda = \infty$ for some $\lambda > 0$. Take $0 < \varepsilon < \lambda$. To prove that $\beta$ is a
lower class function, we can assume that $\beta(n) \ge (d \log\log|n|)^{1/2}$, using an
argument similar to that at the beginning of the proof to show that the
assumption leads to no loss of generality. For notational simplicity, we focus on the
case $d = 2$, as extension of the proof to $d > 2$ is straightforward and the case




d = 1 does not involve multivariate considerations. Define the rectangles 79? as 
in Lemma 3.8(ii) and partition the set {k > 3:7 d) c G,} of bivariate vectors 
k = (kı, k2) into four disjoint sets A1,..., 4 so that kı is odd in A; U A2 
(and even in A3 U Aq) while kz is odd in A; U Az (and even in Az U A4). 
Since the number of k’s belonging to G3 X (Uk>3: ra cc, / (9) is finite and since 


v(I99) = (wy, 41 — wi )(wj, 41 — wi) ~ v(1 7D) as min(k;, k2) — oo, it fol- 
lows from Lemma 3.8(ii) that 575. Erea; v (k-2)) 24-1 (yy yee" (1)/2 = oo, 
and therefore there exists j such that 
(3.32) Y^ v(10-9)e24-1 wp) (99/2 = oo. 

ke; 
Since (n) > (dloglog [n])'?, c(wx) = B(Le"* |) > (d + o(1)) ^ (log [wx "^; 
on the other hand, wi, — wk;—1 = log wk;—1, showing that (3.4) holds with D; = 
IED | c —2 and c = c(wy) + 2/c(wy). Clearly, De = IU also satisfies (3.5), 
so Theorem 3.1(11) can be applied to conclude that 


> P| sup XO > (ux) "-2/c(wy) | 


ken; ‘tere 
puget > v(1&-7) c^ (uy) g € (wx) /2 /(4 V2) aig. 
ke A; 
in view of (3.32). This implies that 
Y. P{Sp/{n|'/? > B(n) + 2/c(wx) for some n with logn e V] 
(3.33)  ke^j 
= OQ. 
Write n < m <n to denote n; < mi X nj for all i. For logn € 7 (k-1) define 
SE = Y ooo n] <men Fm- We shall show that 


(3.34) p» P| sup [Sp — SO? [/In]!? > 1/e(un)| < 00. 
keA;  'logne](-D 


From (3.33) and (3.34), it follows that 5 ke Aj P(F,) = oo, where 


Fk = (S? /In|V? > (n) + 1/c(w x) for some n with logn e 179]. 


Since the $F_k$ are independent events, $P\{F_k \text{ i.o.}\} = 1$ by the converse of the
Borel-Cantelli lemma. Applying the Borel-Cantelli lemma to (3.34) and combining it
with $P\{F_k \text{ i.o.}\} = 1$ then show that $P\{S_n/|n|^{1/2} > \beta(n) \text{ i.o.}\} = 1$. To prove (3.34),
let $v = \lfloor \exp(w_{k-2}) \rfloor$ and note that


S-S YO Yat So Ym— >) Ym 


mi Eni m <2 mi <v msn m<v 




We shall show that 


y 4 


m; xn; m <v 





(3.35) x. P| sup 


ker; Uogner&) 





/ In]? > 1 [Coton < oo. 


Observe that Fm, <n, v; Ym/|n| ^ = X (log m1, wi, 2) (v2/n2)!? and v2/n2 < 
EXP(Wk 2 — wp 1) = Wi > and that for large k € Aj, wj - A / Gc(ux) > 
(3d log |wy|)!/?. Therefore 


P | sup 
logne] *-D 


«| sup X (9I > Gd og lux D"? ] = O(a +) 


Wk--2 <t<wy, 


2, Ym 


mı Eni, MS 








Jm? > Gen) 


for some A > 2, by Theorem 3.1(i). Since J (k-D c G,, this proves (3.35). LJ 


4. Other applications. Section 2 provides a set of general conditions under
which the asymptotic boundary crossing density approximation in Theorem 2.8 is
shown to be valid. Given a specific application, one needs only to verify that these
assumptions are satisfied. In particular, such verification has been carried out for
sums of i.i.d. random variables with multidimensional indices in Section 3, and we
begin this section by carrying out similar verification of (C) and (B1)-(B5) for
multivariate empirical processes. Let $Y_1, Y_2, \ldots$ be i.i.d. $d$-dimensional random vectors
with common distribution function $F$, and let $F_n(t) = n^{-1} \sum_{i=1}^n 1_{\{Y_i \le t\}}$, $t \in \mathbf{R}^d$, be
the empirical distribution function of $Y_1, \ldots, Y_n$. Let $Z_n(t) = \sqrt{n}\{F_n(t) - F(t)\}$
be the multivariate empirical process. The limiting distribution of $Z_n$ is that of a
Gaussian sheet $Z^0$, for which Adler and Brown [2] proved that

(4.1) $K_d\, c^{2(d-1)} e^{-2c^2} \le P\{\sup_t Z^0(t) > c\} \le K_{d,F}\, c^{2(d-1)} e^{-2c^2}$,

where $K_d$ is a constant depending only on $d$ and $K_{d,F}$ is a constant depending on
both $d$ and the distribution $F$. For the case $d = 2$ with independent components
$Y_{i,1}$ and $Y_{i,2}$ of $Y_i$, $Z^0$ is a pinned Brownian sheet, for which Hogan and Siegmund
[20] sharpened (4.1) into

(4.2) $P\{\sup_t Z^0(t) > c\} \sim (4 \log 2)\, c^2 e^{-2c^2}$ as $c \to \infty$.

In Section 5, we apply Corollary 2.9 to prove that if the sample size $n_c$ increases
to $\infty$ with $c$ such that $c = o(n_c^{1/6})$, then we can replace $Z^0(t)$ in (4.2) by $Z_{n_c}(t)$
and also extend the result to general $d$ and general distribution $F$ such that

(4.3) $F$ is continuously differentiable and $\partial F/\partial t_i > 0$ for $1 \le i \le d$.

In view of (4.3), we can apply a change of variables $t \to F(t)$ and assume that
$F$ is a distribution function on the bounded Jordan measurable set $[0, 1]^d$, agreeing
with the assumptions in Corollary 2.9, whose notation (such as $\nu_q$) we use in the
following theorem.
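
To make the object in the next theorem concrete, here is a minimal Python sketch (not from the paper) that evaluates $\sqrt{n}\{F_n(t) - F(t)\}$ over a finite set of points $t$. It assumes, purely for illustration, independent Uniform(0, 1) coordinates so that $F(t) = t_1 \cdots t_d$ on $[0, 1]^d$; the function name `empirical_process_sup` and the use of a finite grid are choices made for this example. A maximum over a grid is only a lower bound for the supremum over all $t$.

```python
import math


def empirical_process_sup(sample, grid):
    """Return the maximum over grid points t of sqrt(n) * (F_n(t) - F(t)).

    Assumes (for illustration only) independent Uniform(0, 1) coordinates,
    so that F(t) = t_1 * ... * t_d on [0, 1]^d.
    """
    n = len(sample)
    best = -math.inf
    for t in grid:
        # F_n(t): fraction of sample points Y_i with Y_i <= t coordinatewise.
        count = sum(all(y_j <= t_j for y_j, t_j in zip(y, t)) for y in sample)
        f_true = math.prod(t)
        best = max(best, math.sqrt(n) * (count / n - f_true))
    return best
```

In practice the grid would be refined (or taken at the sample coordinates) until the maximum stabilizes.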


THEOREM 4.1. Let $M = \{t : F(t) = 1/2\}$ and assume that (4.3) holds and
$c = o(n_c^{1/6})$. Then as $c \to \infty$,


P [sup Zn, (1) > cf 
(4.4) ; 


d 4 
T OF 
~ (8:247 1972c* | IV F7! [T Z— (0 va-i(45. 
M in OF 


(4.5) P {sup |Zne( s c| 


| d ƏF 
281717? | ivre T 7-0 win. 
i=] 4 


COROLLARY 4.2. (i) For $d = 1$ and continuous distribution function $F$,
$P\{\sup_t Z_{n_c}(t) > c\} \sim e^{-2c^2}$ as $c \to \infty$.

(ii) For $d = 2$, if $F(t_1, t_2) = F_1(t_1) F_2(t_2)$ with continuous univariate distribution
functions $F_1$ and $F_2$, then $P\{\sup_t Z_{n_c}(t) > c\} \sim (4 \log 2)\, c^2 e^{-2c^2}$ as $c \to \infty$.


PROOF. (i) We can assume without loss of generality that $Y_i$ is uniform on
$(0, 1)$. In this case $M = \{1/2\}$ and the integral in (4.4) is 1.

(ii) Without loss of generality, assume that $F_1(t_1) = t_1$ and $F_2(t_2) = t_2$,
$0 \le t_1, t_2 \le 1$. In this case $M = \{(u_1, u_2) : u_1 u_2 = 1/2,\ 0 \le u_1, u_2 \le 1\}$ and the integral
in (4.4) becomes

$\int_M \frac{u_1 u_2}{(u_1^2 + u_2^2)^{1/2}}\, \nu_1(du) = \int_{1/2}^1 \frac{u_1 u_2}{(u_1^2 + u_2^2)^{1/2}} \cdot \frac{[u_1^2 + (1/(2u_1))^2]^{1/2}}{u_1}\,du_1 = \int_{1/2}^1 \frac{1}{2u_1}\,du_1 = \frac{1}{2} \log 2,$

completing the proof for the case $d = 2$. □
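
The value $\frac{1}{2}\log 2$ in part (ii) can be checked numerically. The sketch below (an illustration, not the authors' computation) parametrizes the curve $u_1 u_2 = 1/2$ by $u_1 \in [1/2, 1]$ and integrates with a composite Simpson rule. It assumes, as the proof's display indicates, that the integrand of (4.4) is $\|\nabla F\|^{-1} \prod_i \partial F/\partial t_i$; the names `simpson` and `integrand` are hypothetical.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def integrand(u1):
    """Line-integral integrand over M = {u1 * u2 = 1/2}, parametrized by u1."""
    u2 = 1.0 / (2.0 * u1)
    grad_norm = math.hypot(u2, u1)                 # ||grad F|| for F(u1, u2) = u1 * u2
    density = (u2 * u1) / grad_norm                # product of partials over ||grad F||
    arc_length = math.sqrt(1.0 + (u2 / u1) ** 2)   # length element, since du2/du1 = -u2/u1
    return density * arc_length


value = simpson(integrand, 0.5, 1.0)
```

Comparing `value` with `math.log(2) / 2` confirms the closed form; multiplying by the factor 8 implied by Corollary 4.2(ii) recovers the constant $4 \log 2$.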


For $d = 1$, Smirnov [33] has shown that $P\{\sup_t Z^0(t) > c\} = e^{-2c^2}$ for all $c > 0$
and Corollary 4.2(i) yields a corresponding asymptotic formula for $Z_{n_c}$, which was
used by Chung [12] to prove an upper-lower class theorem for the Kolmogorov-Smirnov
statistic. Note that (4.5) says that for some constant $\kappa_d$,
$P\{\sup_t |Z_n(t)| > \lambda_n\} \sim \kappa_d \lambda_n^{2(d-1)} e^{-2\lambda_n^2}$ if $\lambda_n \to \infty$ and $\lambda_n = o(n^{1/6})$.
Since $n\{F_n(t) - F(t)\}$ is a partial sum of the empirical processes $1_{\{Y_i \le t\}} - F(t)$,
we can apply (4.5) and follow Chung's arguments to prove the following.


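COROLLARY 4.3. Let $\lambda_n$ be a nondecreasing sequence such that $\lambda_n \to \infty$
and let $F$ be a distribution function on $\mathbf{R}^d$ satisfying (4.3). Then
$P\{\sup_t |Z_n(t)| > \lambda_n \text{ i.o.}\} = 0$ (or 1) if $\sum_{n=1}^\infty n^{-1} \lambda_n^{2d} e^{-2\lambda_n^2} < \infty$ (or $= \infty$).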


An important difference between our proof of the upper-lower class result in 
Corollary 4.3 and that of Adler and Brown [2] is that they first develop their results 
for the limiting Kiefer processes and then use the strong approximation theorem 
of Dudley and Philipp [14] whereas our approach works directly for the empirical 
process (and of course also for the limiting Kiefer process). The strong approxima- 
tion approach involves embedding the given process in Brownian motion for which 
the integral test can be readily shown to hold by using, for example, the tangent 
approximation (1.1) to the boundary-crossing probability. For partial sums of sta- 
tionary sequences having long-range dependence, the limiting process is no longer 
Brownian motion and strong approximation along the lines of Komlós, Major and 
Tusnady [23], Philipp and Stout [27] and Berkes and Philipp [5] is no longer ap- 
plicable unless one imposes very restrictive assumptions that are described in the 
next paragraph. However, the theory in Section 2 can still be applied. 

In particular, as in [13] and [19], consider partial sums $S_n := \sum_{i=1}^n Y_i$ of linear
processes $Y_i := \sum_{j=-\infty}^{\infty} \tau_{i-j}\, \varepsilon_j$ where $\varepsilon_j$ are i.i.d. random variables with mean 0,
variance 1, $Ee^{t|\varepsilon_1|} < \infty$ for some $t > 0$ and $\{\tau_k\}$ satisfies $\sum_k \tau_k^2 < \infty$. The
sequence $Y_i$ is said to have long-range dependence if $E(Y_1 Y_{n+1}) \sim n^{\alpha - 2} L(n)$ for
some $1 < \alpha < 2$ and slowly varying $L$ so that

(4.6) $\sigma^2(n) := \mathrm{Var}(S_n) \sim 2\{\alpha(\alpha - 1)\}^{-1} n^\alpha L(n).$


Defining $Z_n(\cdot)$ by linear interpolation with $Z_n(t) = S_k/\sigma(n)$ for $t = \sigma^2(k)/\sigma^2(n)$,
Davydov [13] has shown that $Z_n$ converges weakly to a zero-mean Gaussian
process (with correlated increments) whose covariance function is the same as
that in (2.1) with $d = 1$ and $r_t = 1$. Although strong approximation theorems are
not available for such $S_n$, Chan and Lai [9] have been able to derive integral tests
of the type (1.3) for upper-lower class boundaries of $S_n$ in the long-range
dependent case by showing that assumptions (C) and (A1)-(A5) of Corollary 2.7 are
satisfied by Xe) (= X(t)) = Sia /o(Le' ]) and De = [t5, t*] with c = o(e/9), 
(t? — te)/ c^/* — oo but ts — t; = O(c*) for some x > 0. Hence application of 
Corollary 2.7 yields an analog of Theorem 4.1 and therefore also the law of the
iterated logarithm (LIL) for partial sums of long-range dependent linear processes;
see [9]. Wang, Lin and Gulati [35] recently derived the LIL by using a strong
approximation approach that requires $\tau_k$ to have the special form $\tau_k \sim k^{-\beta} L(k)$ as
$k \to \infty$ for some $1/2 < \beta < 1$ and $\tau_k = 0$ for $k < 0$.


Example 2.2 provides a prototypical example in change-point and signal
detection problems, and Theorems 2.4, 2.5, 2.8 and their corollaries can again be applied
to a variety of generalizations of Example 2.2 for these applications. Suppose we
replace the Brownian motion $W(t)$ by a Gaussian field $X(t)$ and $t_2 - t_1$ is replaced
by $\mathrm{Var}(X(t_2) - X(t_1))$; see [1] and [7]. Again conditions (C) and (A1)-(A5) can
be shown to hold for these applications and also for their discrete-time analogues
(like $S_n$ in Section 3); see [10].

Suppose we replace the Brownian motion $W(t)$ in Example 2.2 by a sample
sum process $S_{\lfloor n_c t \rfloor}$ where $S_n = Y_1 + \cdots + Y_n$ and the $Y_i$ are i.i.d. with mean 0,
variance 1 and $Ee^{t|Y_1|} < \infty$ for some $t > 0$. Then instead of $X$, we now have a
random field $X_c$ defined on $D$ such that

$X_c(m/n_c, n/n_c) = (S_n - S_m)/(n - m)^{1/2}$
for $m < n \le a n_c$ with $a_1 n_c \le n - m \le a_2 n_c$.

The stopping time $T_c = \inf\{n : \max_{n - a_2 n_c \le m \le n - a_1 n_c} (S_n - S_m)/(n - m)^{1/2} > c\}$
has important applications in sequential change-point detection. Assuming that
$n_c/c^6 \to \infty$ as $c \to \infty$ and making use of moderate deviations theory, Chan and
Lai [11] have shown that $X_c$ satisfies conditions (C) and (A1)-(A5). Therefore,
analogous to (2.10),

(4.7) $P\{T_c \le a n_c\} \sim \psi(c)\,(c^4/4)\,[a(a_1^{-1} - a_2^{-1}) - \log(a_2/a_1)].$


This result provides an important tool for the choice of the threshold $c$ and window
sizes of the detection rule $T_c$ to ensure a prescribed false detection rate; see
[24] and [11] where the asymptotic optimality (in the sense of quickest detection
delay) and extensions (to multivariate $Y_i$ and Markovian $Y_i$) of $T_c$ are also given.
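
The window-limited detection rule $T_c$ described above can be illustrated with a minimal sketch. It assumes the statistic is $(S_n - S_m)/(n - m)^{1/2}$ maximized over window lengths between two integer bounds; the function name and the parameters `min_window` and `max_window` (standing in for $a_1 n_c$ and $a_2 n_c$) are hypothetical and chosen only for illustration.

```python
import numpy as np

def window_limited_stopping_time(y, c, min_window, max_window):
    """Return the first n at which the windowed scan statistic exceeds c.

    The statistic at time n is the maximum over m with
    min_window <= n - m <= max_window of (S_n - S_m) / sqrt(n - m),
    where S_n = y[0] + ... + y[n-1]. Returns None if c is never exceeded.
    Assumes 1 <= min_window <= max_window (illustrative parameters).
    """
    s = np.concatenate(([0.0], np.cumsum(y)))  # s[n] = Y_1 + ... + Y_n
    for n in range(min_window, len(y) + 1):
        lo = max(0, n - max_window)            # earliest admissible m
        hi = n - min_window                    # latest admissible m
        m = np.arange(lo, hi + 1)
        stat = np.max((s[n] - s[m]) / np.sqrt(n - m))
        if stat > c:
            return n
    return None
```

Raising `c` lowers the false detection rate at the cost of a longer detection delay, which is the trade-off that an approximation such as (4.7) is meant to quantify.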


5. Proofs of Theorems 2.1, 2.4, 2.5, 2.8, 4.1 and their corollaries. To study
the asymptotic distribution (as $t \to \infty$) of $\sup_{0 \le s \le t} X(s)$ of a stationary Gaussian
process with $EX(s) = 0$, Pickands [28] introduced a method, which has undergone
subsequent refinements and is now commonly known as the method of double
sums (cf. Chapter 2 of [29], [30, 31]), to derive the asymptotic behavior of
$P\{\sup_{0 \le s \le t} X(s) > c\}$ as $c \to \infty$. In this section, we modify the double sum
method for non-Gaussian fields, to which powerful tools like Slepian's inequality
and Fernique's theorem for the Gaussian case (cf. [29]) are no longer applicable.
In particular, unlike the traditional double sum $\sum_{i \ne j} P\{\sup_{u \in I_i} X(u) > c, \sup_{v \in I_j} X(v) > c\}$
that is shown to be negligible relative to the single sum
$\sum_i P\{\sup_{u \in I_i} X(u) > c\}$ for stationary isotropic Gaussian fields (cf. [28, 29]),
note that (2.12) involves $P\{\sup_{u \in I_{t,\ell_c\Delta_c}} X_c(u) > c, \sup_{v \in B \setminus I_{t,\ell_c\Delta_c}} X_c(v) > c\}$ instead.

The proof of Corollary 2.6 (or 2.7) involves covering $D$ (or $D_c$) by cubes
of the form $I_{t,K\Delta_c}$, and using a discrete approximation $A_t\ (= A_t(a, m, c)) :=
\{t + ka\Delta_c : 0 \le k_i < m, k \in \mathbf{Z}^d\}$ of $I_{t,K\Delta_c}$, where $a = K/m$. To distinguish
from the scalar $K = ma$, we shall use $k$ to denote the elements of $\mathbf{Z}^d$. Since




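$P\{\sup_{v \in I_{u,a\Delta_c}} X_c(v) > c, X_c(u) \le c - \gamma/c\} \le N_a(\gamma)\psi(c)$ by (A4), approximating
the tail probability of $\sup_{u \in I_{t,K\Delta_c}} X_c(u)$ by that of $\sup_{u \in A_t} X_c(u)$ has the error
bounds

$$
\begin{aligned}
0 &\le \Bigl[P\Bigl\{\sup_{u \in I_{t,K\Delta_c}} X_c(u) > c\Bigr\} - P\Bigl\{\sup_{u \in A_t} X_c(u) > c\Bigr\}\Bigr]\Big/\psi(c) \\
&\le \Bigl[P\Bigl\{c - \gamma/c < \sup_{u \in A_t} X_c(u) \le c\Bigr\}
 + \sum_{u \in A_t} P\Bigl\{\sup_{v \in I_{u,a\Delta_c}} X_c(v) > c, X_c(u) \le c - \gamma/c\Bigr\}\Bigr]\Big/\psi(c) \qquad (5.1)\\
&\le P\Bigl\{c - \gamma/c < \sup_{u \in A_t} X_c(u) \le c\Bigr\}\Big/\psi(c) + (K/a)^d N_a(\gamma),
\end{aligned}
$$

uniformly for $t \in [D]_\delta$ and $\gamma_a \le \gamma \le c$. The proof of Theorem 2.4 makes use
of (5.1) and Lemma 5.1. Theorem 2.5 is introduced to provide a building block to
handle nonstationary random fields (or nonconstant boundaries) in Corollary 2.6
(or Theorem 2.8), which can be proved by much easier arguments in the case of
stationary random fields; see Remark 5.1.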


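LEMMA 5.1. Under (C) and (A1)-(A3),

$$H_{K,a}(t) = \int_0^\infty e^y P\Bigl\{\sup_{0 \le k_i < m} W_t(ak) > y\Bigr\}\,dy$$

is uniformly continuous in $t \in [D]_\delta$ and $\sup_{t \in [D]_\delta} H_{K,a}(t) < \infty$. Moreover, for
$\gamma \ge 0$, as $c \to \infty$,

$$P\Bigl\{\sup_{u \in A_t} X_c(u) > c - \gamma/c\Bigr\} \sim \psi(c - \gamma/c)[1 + H_{K,a}(t)] \quad \text{uniformly for } t \in [D]_\delta. \qquad (5.2)$$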
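
The integral defining $H_{K,a}(t)$ in Lemma 5.1 can be rewritten as an exponential moment, which may make the quantity easier to interpret. The following is a sketch by Tonelli's theorem, with $M$ and $M^+$ being shorthand introduced here for illustration and not the paper's notation:

```latex
% M := \sup_{0 \le k_i < m} W_t(ak), \qquad M^+ := \max(M, 0)
\int_0^\infty e^{y}\, P\{M > y\}\, dy
  = E\Bigl[\int_0^{M^+} e^{y}\, dy\Bigr]
  = E\bigl[e^{M^+}\bigr] - 1 .
```

If the index $k = 0$ is included in the supremum and $W_t(0) = 0$ (an assumption consistent with $W_t$ arising as a limit of increments), then $M \ge 0$, so $1 + H_{K,a}(t) = E[e^{M}]$, which is the form in which Pickands-type constants are often written.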


PROOF. Let $\varepsilon > 0$. By (A3), there exists $\gamma^* > \gamma$ such that $h(\gamma^*) < \varepsilon/m^d$ and


0< P| sup Xe(u) > c — v/e] 


HEA; 


= P| sup X,(u) > c,c— y*/c < X(t) <c— v/c} 
(5.3) HEÁr | 
— P] sup X-(u) >c, Xc(t) <c — y*/e| 


u EA; 


< m^ h(y*)y (c) « ev (c), 




since |A;| =m. By (AL), there exists £; — 0 such that 
(5.4) |P{X-(t) > c — y/c (c — y/c) — 1] = OE?) 


uniformly for y < y < y*; we can also assume that £7! (y* — y) € Z. Since e5c = 
1 + £, + O (&2) and y (c — y/c) ~ e! yr (c), (5.4) implies that 


Pic — (y + &)/c < Xc(t) < e — y/c] 


— (14- O(€2))e? **c y (e) — (1+ O(2))e? y (c) ~ Eee” vr (c). 
By (A2), uniformly for t € [D]; and y < y x y*, 


(5.5) 


P | sup Xcu) >c,e— o- Efe < X(f) <e — y/c] 
ucA, 


(5.6) 
~ P| sp Wi(ak) > y| Ple- O + 8)/e < Xel) xc y/o). 
0k; «m 

Applying (5.5) to (5.6) and summing (5.6) over y = j& + y for j =0,1,..., 
Ju l(y* — y) — 1, we obtain (5.2) from (5.3) with arbitrarily small £. Since 
$\int_0^\infty e^y P\{W_t(ak) > y\}\,dy < \infty$ for all $k$ and $A_t$ is a finite set, $H_{K,a}(t)$ is finite and
its uniform continuity follows from (2.1) and (2.2), with the convergence in (2.2)
being uniform in $t \in [D]_\delta$ [see the sentence describing assumption (A2)]. Recall
in this connection that $\sup_{t \in [D]_\delta, v \in S^{d-1}} r_t(v) < \infty$, yielding the finiteness of
$\sup_{t \in [D]_\delta} H_{K,a}(t)$. □


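PROOF OF THEOREM 2.4. Let $a > 0$. By (A4), (5.1) and (5.2), we have for
all large $c$,

$$
\begin{aligned}
0 &\le \Bigl[P\Bigl\{\sup_{u \in I_{t,K\Delta_c}} X_c(u) > c\Bigr\} - P\Bigl\{\sup_{u \in A_t} X_c(u) > c\Bigr\}\Bigr]\Big/\psi(c) \\
&\le 2(e^{\gamma_a} - 1)[1 + H_{K,a}(t)] + (K/a)^d N_a(\gamma_a). \qquad (5.7)
\end{aligned}
$$

By (A4), for any $\varepsilon > 0$, we can choose $a^*$ small enough such that $N_a(\gamma_a)/a^d <
\varepsilon/K^d$ and $2(e^{\gamma_a} - 1) < \varepsilon$ for all $0 < a \le a^*$. Therefore, by (5.2) and (5.7),

$$
(1 - \varepsilon)(1 + H_{K,a}(t)) \le P\Bigl\{\sup_{u \in I_{t,K\Delta_c}} X_c(u) > c\Bigr\}\Big/\psi(c) \le (1 + 2\varepsilon)(1 + H_{K,a}(t)) + \varepsilon, \qquad (5.8)
$$

for all large $c$ and all $t \in [D]_\delta$ and $0 < a \le a^*$. We shall restrict $a$ and $a^*$ to
$\{2^{-j} : j = 1, 2, \ldots\}$ so that the integrand of $H_{K,a}(t)$ is monotone in $a$ and increases
to the integrand of $H_K(t)$ as $a \downarrow 0$. Hence by the monotone convergence theorem,
$H_{K,a}(t) \to H_K(t)$ as $a \to 0$. Therefore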


(5.9) 1+ Hg a(t) <14+ Hg(t) € (13 - )(1 + Hg a (0)) +e, 




for all $t \in [D]_\delta$ and $0 < a \le a^*$ (with $a, a^* \in \{2^{-j} : j = 1, 2, \ldots\}$). We shall use
(5.8) and (5.9) in conjunction with Lemma 5.1 to derive the desired conclusions of
the theorem.

First note that $M := 1 + \sup_{t \in [D]_\delta} H_K(t) < \infty$ in view of (5.9) and Lemma 5.1
and therefore

(5.10) $|H_K(t) - H_{K,a^*}(t)| \le (M + 1)\varepsilon$ for all $t \in [D]_\delta$,

by (5.9) with $a = a^*$. Because $H_{K,a^*}$ is uniformly continuous by Lemma 5.1,

(5.11) $|H_{K,a^*}(t) - H_{K,a^*}(u)| \le \varepsilon$ if $\|t - u\| \le \delta^*$, $t, u \in [D]_\delta$,

for some $\delta^* > 0$. Since $|H_K(t) - H_K(u)| \le |H_K(t) - H_{K,a^*}(t)| + |H_K(u) -
H_{K,a^*}(u)| + |H_{K,a^*}(t) - H_{K,a^*}(u)|$, it follows from (5.10) and (5.11) that $|H_K(t) -
H_K(u)| \le 2(M + 1)\varepsilon + \varepsilon$ if $\|t - u\| \le \delta^*$. As $\varepsilon$ is arbitrary, this shows that $H_K$ is
uniformly continuous. Combining (5.8) with (5.10) and the definition of $M$ yields
that for all large $c$ and $t \in [D]_\delta$,

$$
-\varepsilon - \varepsilon M - \varepsilon^2(M + 1) \le (\psi(c))^{-1} P\Bigl\{\sup_{u \in I_{t,K\Delta_c}} X_c(u) > c\Bigr\} - (1 + H_K(t)) \le \varepsilon + 2\varepsilon M + 2\varepsilon^2(M + 1).
$$

Since $\varepsilon$ is arbitrary, this proves Theorem 2.4. □


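LEMMA 5.2. Under (C) and (A1)-(A4), $\sup_{t \in [D]_\delta, K \ge 1} K^{-d} H_K(t) < \infty$ and
$\{K^{-d} H_K : K \ge 1\}$ is uniformly equicontinuous on $[D]_\delta$, that is,
$\sup_{K \ge 1,\, t, s \in [D]_\delta,\, \|t - s\| \le \varepsilon} |K^{-d} H_K(t) - K^{-d} H_K(s)| \to 0$ as $\varepsilon \to 0$.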


PROOF. Without loss of generality we can restrict $K$ to be integers. Take any
positive integer $a^{-1}$. Note that the integrand of $H_{K,a}(t)$ involves the set $\{ak : 0 \le
k_i < K/a\}$, which can be partitioned into $K^d$ disjoint subsets $L_j$ such that $|L_j| =
a^{-d}$. We can therefore use the arguments at the end of the proof of Lemma 5.1 to
bound





Kk? 
Kc 
j=l 


Hi 
-a 
— 


P] sup W, (ak) > » — P| sup W,(ak) > y} 
keL; keL; 


and thereby establish the uniform equicontinuity and boundedness of $\{K^{-d} H_{K,a} :
K \ge 1\}$ on $[D]_\delta$. Moreover, by partitioning the cube $[0, K)^d$ similarly into $K^d$
unit cubes, it can be shown that $\sup_{K \ge 1,\, t \in [D]_\delta} |K^{-d} H_K(t) - K^{-d} H_{K,a}(t)| \to 0$ as
$a \to 0$. Hence we can proceed as in (5.10) and (5.11) but with $H_{K,a}$ and $H_K$
replaced by $K^{-d} H_{K,a}$ and $K^{-d} H_K$ to prove the uniform equicontinuity and
boundedness of $\{K^{-d} H_K : K \ge 1\}$. □




LEMMA 5.3. Under (C) and (A1)-(A5), there exist constants $s_K \to 0$ as
$K \to \infty$ such that

(5.12) $P\bigl\{\sup_{u \in I_{t,K\Delta_c}} X_c(u) > c, \sup_{v \in B \setminus I_{t,K\Delta_c}} X_c(v) > c\bigr\} \le s_K K^d \psi(c)$

for $c$ large enough, uniformly over $t \in [D]_\delta$ and over subsets $B$ of $[D]_\delta$ with
bounded volume.


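PROOF. Let $a > 0$ and $0 < q < p$. Then

$$G_a := \sum_{w \in (a\mathbf{Z})^d} \exp(\|w\|^q) f(\|w\|) < \infty.$$

Let $m, n$ be positive integers that are large enough such that

$$\sum_{w \in (a\mathbf{Z})^d,\, \|w\| \ge na} \exp(\|w\|^q) f(\|w\|) \le \varepsilon a^d/8$$

and

$$[1 - (1 - 2n/m)^d] \le \varepsilon a^d/(8G_a).$$

Let $K = ma$, $F_{1,t} = \{t + ka\Delta_c : n \le k_i < m - n, k \in \mathbf{Z}^d\}$, $F_{2,t} = A_t \setminus F_{1,t}$,
$B_t = \{t + ak\Delta_c \in B \setminus I_{t,K\Delta_c} : k \in \mathbf{Z}^d\}$, $g_{u,v} = \min\{c - \gamma_a, (\|v - u\|/\Delta_c)^q\}$. Then
by (A5),

$$
\begin{aligned}
&P\{X_c(u) > c - (\gamma_a + g_{u,v})/c,\ X_c(v) > c - (\gamma_a + g_{u,v})/c\} \\
&\qquad \le \psi(c - (\gamma_a + g_{u,v})/c) f(\|u - v\|/\Delta_c) \qquad (5.13)\\
&\qquad \le 2e^{g_{u,v}} \psi(c - \gamma_a/c) f(\|u - v\|/\Delta_c),
\end{aligned}
$$

for all large $c$ and small $a$. For $u \in F_{1,t}$ and $v \in B_t$, $\|u - v\|/\Delta_c \ge na$ and $g_{u,v} \le
(\|u - v\|/\Delta_c)^q$. Noting that $|F_{1,t}| \le m^d$, $|F_{2,t}| \le m^d - (m - 2n)^d = m^d[1 - (1 -
2n/m)^d]$, and that $\sum_{u \in A_t} = \sum_{i=1}^2 \sum_{u \in F_{i,t}}$, we obtain from (5.13) that for all large
$c$ and small $a$,

$$
\begin{aligned}
&\sum_{u \in A_t} \sum_{v \in B_t} P\{X_c(u) > c - (\gamma_a + g_{u,v})/c,\ X_c(v) > c - (\gamma_a + g_{u,v})/c\} \\
&\qquad \le 2\psi(c - \gamma_a/c) m^d \Bigl[\sum_{w \in (a\mathbf{Z})^d,\, \|w\| \ge na} \exp(\|w\|^q) f(\|w\|) + (1 - (1 - 2n/m)^d) G_a\Bigr] \qquad (5.14)\\
&\qquad \le (\varepsilon K^d/2) \psi(c - \gamma_a/c).
\end{aligned}
$$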


Define Aw = minyeA, Zuw if w € Br, and Ay = 0 if w € A;. Then 


P| sup X,({u)>c, sup Ke(v) > c| 


uel K Ac vEB\ It KA: 


(545) x 5 D> P{X-(u) » c — (a + 8uv)/c, Xc(v) > c — (Ya + Buv)/c} 


ucA; vcB, 


+ YD P| sup Xe) >c,Xe(w) <c (Ya + Aw)/e}, 


weA;UB, ZElw,ahe 


On the right-hand side of (5.15), the first sum can be bounded by (5.14) and the 
second sum by 


Y P| sup X.) > c, X à) <c — ya/e| 


u€Á, ZEly adc 
(5.16) + > P| sup Xc(z) >c, Xc(v) x e — (ya ye] 
v€ B, Z€lyaAc 
< (K/a) Na (ya) (e) + Y^ Na(ya + Ay) (c), 


v€B, 


in view of (A4) and that |A;| = m? = (K/a)?. To bound the last sum „e B 
in (5.16), first consider the case d = 1. Since A, > min{(ak)?,c — ya} if akA; < 
inf,;cA, |v — u| < a(k + 1) A., and since Ng is nonincreasing, it follows that 


> Na ya + Av) W (c) 


vcB, 


(5.17) «ats + (ak) + Nalon(B)/(ao| 
k=] 


< |a Í ” Na(ya y dy + v(B)Na(o)/(ad.)}. 


Integration by parts shows that the integral in (5.17) approaches 0 as a — 0, since 
Na(ya) + fr w’ Na(VYa + w) dw = o(a) for s > (qu) — 1)*. Moreover, in view 
of (2.5), Na(c)/Ac = Oro, w’ Na(va + w) dw) = o(a) as a > 0 and c > oo, 


c 


for s > 2/a. Therefore, > yep, Na(ya + Av) < €/4 for all large c and small a. In 
general, for d > 1, 


b» Na(ya + Av) 


veB, 


(5.18) < ila 3,0 K +2) Na(Ya + (aj?) + Sd 
j=l 


<eK?!/4 for all large c and small a, 




as can be shown by arguments similar to those in the case $d = 1$. Combining (5.15)
with (5.14) and (5.16)-(5.18) yields the desired conclusion. □


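PROOF OF THEOREM 2.5. Let $1 > \varepsilon > 0$. There exists $K^*$ such that $s_K < \varepsilon/3$
for all $K \ge K^*$. For fixed $t \in D$ and $K \ge K^*$, define

$$
\begin{aligned}
\underline{A} &= \{u \in (K\Delta_c\mathbf{Z})^d : I_{u,K\Delta_c} \subset I_{t,\ell_c\Delta_c}\}, \\
\overline{A} &= \{u \in (K\Delta_c\mathbf{Z})^d : I_{u,K\Delta_c} \cap I_{t,\ell_c\Delta_c} \ne \emptyset\}, \qquad J_u = I_{u,K\Delta_c}. \qquad (5.19)
\end{aligned}
$$

Covering $I_{t,\ell_c\Delta_c}$ by cubes of length $K\Delta_c$ and letting $B$ be a subset of $[D]_\delta$
containing $I_{t,\ell_c\Delta_c}$ and such that $v(B) \le v_0$, we have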

` LE sup X-(v) > c| — P] sup Xc(v) >c, sup Xe(w)> 2l 


ucA v€J, v€J, wEB\Ju 


(5.20) 
< P| sup Xe(v)> c <)> P| sup X-(v) > c). 


VEL tL AC ucA ved, 
By Theorem 2.4 and Lemma 5.3, as c — co, 


(1+0(1))W(c) Y 2 LH (0) — sk K^] 
ucA 


(5.21) 
<P} sup X0) >c} < (1--00))6(O D Hx), 


vel — 
Edr lc Ac ucA 


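uniformly in $t \in D$. In view of $\ell_c\Delta_c \to 0$ and the uniform equicontinuity in
Lemma 5.2, we can choose $c^*$ large enough so that $|K^{-d} H_K(u) - K^{-d} H_K(t)| \le
\varepsilon/3$ for all $c \ge c^*$, $\sqrt{\ell_c} \ge K \ge K^*$, $t \in D$ and $u$ in the outer index set $\overline{A}$
$(= \overline{A}(t; K\Delta_c, \ell_c\Delta_c))$ of (5.19). Putting
this and the bound $s_K < \varepsilon/3$ in (5.21) and dividing (5.21) by $\ell_c^d \psi(c)$, we obtain
for all $c \ge c^*$, $\sqrt{\ell_c} \ge K \ge K^*$ and $t \in D$,

$$
(1 - \varepsilon)\{K^{-d} H_K(t) - 2\varepsilon/3\} \le P\Bigl\{\sup_{v \in I_{t,\ell_c\Delta_c}} X_c(v) > c\Bigr\}\Big/(\ell_c^d \psi(c)) \le (1 + \varepsilon)\{K^{-d} H_K(t) + \varepsilon/3\}, \qquad (5.22)
$$

since both index sets in (5.19) have cardinality asymptotic to $(\ell_c/K)^d$. By Lemma 5.2,
$M := \sup_{t \in [D]_\delta, K \ge 1} K^{-d} H_K(t) < \infty$.
Therefore, it follows from (5.22) that

$$
\sup_{t \in D} \Bigl| P\Bigl\{\sup_{v \in I_{t,\ell_c\Delta_c}} X_c(v) > c\Bigr\}\Big/(\ell_c^d \psi(c)) - K^{-d} H_K(t) \Bigr| \le \varepsilon M + 2\varepsilon/3, \qquad (5.23)
$$

for all $c \ge c^*$ and $\sqrt{\ell_c} \ge K \ge K^*$. Letting $c \to \infty$ in (5.23) yields

$$\sup_{t \in D} |K^{-d} H_K(t) - \tilde{K}^{-d} H_{\tilde{K}}(t)| \le 2\varepsilon M + 4\varepsilon/3$$

for $K \ge K^*$ and $\tilde{K} \ge K^*$, establishing that $\{K^{-d} H_K\}$ is uniformly Cauchy. Hence
$K^{-d} H_K(t)$ converges uniformly in $t \in D$ to $H(t)$, which is also bounded by $M$.
We can therefore proceed as in the second paragraph of the proof of Theorem 2.4
to show that $H(t)$ is uniformly continuous in $t \in D$. Moreover, taking $K$ large
enough such that $\sup_{t \in D} |K^{-d} H_K(t) - H(t)| \le \varepsilon/3$, it follows from (5.23) that

$$
\sup_{t \in D} \Bigl| P\Bigl\{\sup_{v \in I_{t,\ell_c\Delta_c}} X_c(v) > c\Bigr\}\Big/(\ell_c^d \psi(c)) - H(t) \Bigr| \le \varepsilon(M + 1)
$$

for all $c \ge c^*$, proving (2.11).

We next show that $\inf_{t \in D} H(t) > 0$. For the function $f$ in (A5), we can choose
$a > 0$ large enough so that $\sum_{k \in \mathbf{Z}^d \setminus \{0\}} f(a\|k\|) \le 1/2$. Let $K = ma$ and $u \in A_t =
A_t(a, m, c)$ as in the paragraph preceding Lemma 5.1 so that $|A_t| = m^d$. Then by
(A1) and (A5), as $c \to \infty$,

$$
\begin{aligned}
P\Bigl\{\sup_{u \in A_t} X_c(u) > c\Bigr\}
&\ge \sum_{u \in A_t} \Bigl[P\{X_c(u) > c\} - \sum_{v \in A_t, v \ne u} P\{X_c(u) > c, X_c(v) > c\}\Bigr] \\
&\ge \sum_{u \in A_t} (1 + o(1))\psi(c)/2 \qquad (5.24)\\
&= (1 + o(1)) m^d \psi(c)/2,
\end{aligned}
$$

uniformly in $t \in D$ and $m \ge 2$. Combining (5.24) with Theorem 2.4 yields

$$
1 + H_{ma}(t) = \lim_{c \to \infty} (\psi(c))^{-1} P\Bigl\{\sup_{u \in I_{t,ma\Delta_c}} X_c(u) > c\Bigr\}
\ge \limsup_{c \to \infty} (\psi(c))^{-1} P\Bigl\{\sup_{u \in A_t} X_c(u) > c\Bigr\} \ge m^d/2
$$

for all $m \ge 2$ and $t \in D$. Since $\lim_{K \to \infty} K^{-d} H_K(t) = H(t)$, it then follows that
$H(t) \ge a^{-d}/2$ for all $t \in D$. Finally, to prove (2.12), apply (5.12) to obtain that for
all $t \in D$ and large $c$,

$$
\begin{aligned}
P\Bigl\{\sup_{u \in I_{t,\ell_c\Delta_c}} X_c(u) > c, \sup_{v \in B \setminus I_{t,\ell_c\Delta_c}} X_c(v) > c\Bigr\}
&\le \sum_{u \in \overline{A}} P\Bigl\{\sup_{v \in J_u} X_c(v) > c, \sup_{w \in B \setminus J_u} X_c(w) > c\Bigr\} \\
&\le |\overline{A}|\, s_K K^d \psi(c).
\end{aligned}
$$

Since $s_K \to 0$ as $K \to \infty$ and $|\overline{A}| \sim (\ell_c/K)^d$ as $\ell_c/K \to \infty$, (2.12) follows. □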



PROOF OF COROLLARY 2.6. A basic idea of the proof is to cover the
bounded, Jordan measurable set $D$ by cubes of length $\ell_c\Delta_c$, with $\ell_c \to \infty$ such
that $\ell_c\Delta_c \to 0$. Define $\underline{A}$, $\overline{A}$ and $J_u$ as in (5.19) but with $(K\Delta_c\mathbf{Z})^d$ replaced by
$(\ell_c\Delta_c\mathbf{Z})^d$, $I_{u,K\Delta_c}$ by $I_{u,\ell_c\Delta_c}$ and $I_{t,\ell_c\Delta_c}$ by $D$. Then (5.20) still holds with these
new definitions of $\underline{A}$, $\overline{A}$ and $J_u$ and also with $B$ replaced by the bounded set $[D]_\delta$.
Labeling it as (5.20'), the upper and lower bounds in (5.20') are both asymptotically
equivalent to $\Delta_c^{-d} \psi(c) \int_D H(t)\,dt$ by Theorem 2.5, since $\ell_c\Delta_c \to 0$
and $H(t)$ is continuous. □


REMARK 5.1. Corollary 2.6 can be proved by easier arguments, to be 
sketched below, when X,(t) = X (t) is stationary. Let 
Aj — (te D:t € (JAG), 


Iu | sup Xa) >c}, 


ucD 


R=] sup X) » c — y/e}, 
ucA, (a, K /a,c) 


&-| sup X) » c v/e]. 
u€ A, (à, K /à,c) 




















Then 
Y; P(A) - Y PË) 
teAxK teAg 
(5.25) 
<| Y; P(A) -PF +) Y PD — PEF). 
teAx teAg 


It can be shown by arguments similar to those in the proof of Lemmas 2.3 and 2.4 
of [31] that limsup,_,.9| Crea, PUR) — PQP)I/(r (c) A77) — 0 as K > oo, 
à — 0 and y — 0. Moreover, by Lemma 5.1 and stationarity, 


(5.26) >> PUR) ~ u(D)(K Ac) y (c — y/o) + Hia), 
teAK 


and a similar relation also holds for 5 ", Ag P(F,). Hence by (5.25), 
(527 |K~4Hxa—K ^Hg;|—0 asK,K—>0o,a,4->0, 


which implies that $\lim_{K \to \infty, a \to 0} K^{-d} H_{K,a}$ exists by the Cauchy convergence
property, yielding $H$ as the limit. For nonstationary random fields, we do not
have the simple relation (5.26) and cannot show the existence of the limit of
$K^{-d} H_{K,a}(t)$ via Cauchy convergence as in (5.27). This is why more complicated




arguments are needed in the proofs of Lemma 5.3 and Theorem 2.5, from which
Corollary 2.6 follows. Concerning the proofs of Theorems 2.4 and 2.5, since $H_{K,a}$
and $H$ are defined by the Gaussian processes $W_t$ rather than the process $X_c$
satisfying (A1)-(A5), one may wonder why these assumptions have been involved
in their proofs (and also that of Lemma 5.1) to establish continuity and
boundedness properties of $H_{K,a}$ and $H$. It turns out that for a Gaussian random field
$X_c = X$ whose covariance function satisfies condition (C), assumptions (A1)-(A5)
also hold with $W_t$ being the limiting process in (A2); see the following proof of
Theorem 2.1 which generalizes the Qualls-Watanabe result (2.6) to nonstationary
Gaussian fields.


PROOF OF COROLLARY 2.7. Here we modify (5.19) into 
A, = fu € (22^ : Iute C Deh, 
(5.28) Ne = fu € (£2) : 1,4, De =Ø}, 
Ju = lu, 


and replace B in (5.20) by [D,]s so that we have here a corresponding version 
of (5.20), labeled as (5.20*). Apply (2.12) with £e = £c/ Ac together with (2.14) 
and (5.20*) to derive (2.15). C 


PROOF OF THEOREM 2.8. It follows from (2.11) and (2.16) that 
(5.29) P{X,(u) > b.(u) for some u € I; c) ~ (Co/ Any) V (be (0) HO. 


From (2.12) with the boundary c replaced by 0. (¢) and with £, = ¢,/Ac¢, it follows 
that 


P{X-(u) > be(u), Xc(v) > max(bc(v), b,(t)) 


. for some u € Il; zv € B \ Ire} 
(5.30) 


< P| sup Xe(u) > b,(t), sup Xe(v) > Be(O} 


HEL; te ve BM, te 


= o((Ec/ ^u. c)" Vr A) 


uniformly over t € D, and over subsets B of [D]; with bounded volume. Then 


2 [PU » be(w) for some w € Jy} 


uEAe 


= > P[Xc(w) > be(w), Xe(z) > marbb) 


VEA, VÉU 


(5.31) for some w € Ju, ze Zi 




< P{X,(u) > b,(u) for some u € De} 
< > P(X.(w) > be(w) for some w e Ju}, 
u€Ac 
in which A, and A, are defined by (5.28). By (5.29) and (5.30), the lower and 


upper bounds in (5.31) are asymptotically equal. Since £, — 0, the desired con- 
clusion then follows. Li 


PROOF OF COROLLARY 2.9. .Since a < 2, there exists € > 0 such that A, = 
o(c- € *9), We can therefore choose e — 0 and £e — 0 such that 


(5.32) G/Ac— 00, | &mc loge bebe =0(c~), 


so £c = o(€,). Consider the tubular neighborhood Ug, of ..M. For sufficiently 
small £, the elements of Us can be uniquely represented in the form x + y with 
x€ M, y € TM+(y) and [yl < £. Since Vb(t) = 0 for all t belonging to the 
compact set M, there exists B > 0 such that || Vb(u)|| < B&. for all u € [Uz.]2;,. 
Combining this with (5.32) yields 


sup b() - inf bU) = Olek) = o(c ^) 


uel; te 
(5.33) uniformly over t € [Ug ]az.. 
Recalling that b. (u) = cb(u) and applying the identity y? — x? = (y — x)(y + x), 
we can conclude from (5.33) that SUD; civ, p. [b (t) — b? (1)] = o(1). 


Let y € M and z € T M^ (y). Then b(y + z) = bp + z/V?b(y)z/2 + O (llz|). 
Applying Theorem 2.8 to De = Ue, yields 


P(X.(t) > cb(t) for some t € Us] 


| ~ ard f, V (cb(t)) O H(t) dt 
(5.34) ý 
~ Az y (chp be!” 


x J H (y) | exp(—c*bpz'V*b(y)z/2) dzv,_ (dy). 
M zeT M4 (y), lizi x &c 


Since V? b(y) is positive definite, infye D\U;, bU) = bp + B’ &? for some B’ > 0. 
Hence by Theorem 2.8, 
P{X_-(t) > cb(t) for some t € D \ Ug,} 


(5.35) < P| sup X,(t) > c(bp + se? 
t€ DXNUs. 


— o(A; ^ v (cbp)), 




in view of (5.32). Combining (5.34) with (5.35) and evaluating the inner integral
in (5.34) give (2.19). □


PROOF OF THEOREM 2.1. We shall show that (A1)-(A5) hold for the
Gaussian random field $X$ satisfying (2.7), which is the same as condition (C) in
the present case $X_c = X$, and hence Theorem 2.1 follows from Theorem 2.5 and
Corollary 2.6. In particular, (A1) follows from the well-known asymptotic tail
behavior of a normal distribution. Let $\rho(t, u) = E[X(t)X(u)]$. Since the conditional
distribution of $X(t + u\Delta_c)$ given $X(t)$ is normal with mean $\rho(t, t + u\Delta_c)X(t)$, it
follows from (2.7) that as $c \to \infty$,

$$
\begin{aligned}
E\{c[X(t + u\Delta_c) - X(t)] \mid X(t) = c - y/c\}
&= -c[1 - \rho(t, t + u\Delta_c)](c - y/c) \qquad (5.36)\\
&\to -\|u\|^\alpha r_t(u/\|u\|)/2,
\end{aligned}
$$

$$
\begin{aligned}
&\mathrm{Cov}\{c[X(t + u\Delta_c) - X(t)], c[X(t + v\Delta_c) - X(t)] \mid X(t) = c - y/c\} \\
&\qquad = c^2[\rho(t + u\Delta_c, t + v\Delta_c) - \rho(t, t + v\Delta_c)\rho(t, t + u\Delta_c)] \qquad (5.37)\\
&\qquad \to [-\|v - u\|^\alpha r_t((v - u)/\|v - u\|) + \|v\|^\alpha r_t(v/\|v\|) + \|u\|^\alpha r_t(u/\|u\|)]/2.
\end{aligned}
$$

Since (c[ X (t + ak A.) — X(t)]:0 x k; < m} is multivariate normal, (A2) then fol- 
lows. Let y > 0. Since yr(c — z/c) ~ e*v (c) for all z > 0 and there exist constants 
B, B' > 0 such that P(W,(u) > z — y) < Bexp(—B'z?), it follows from (5.36) 
and (5.37) that as c > oo, 


P{X(t+uA,) >c—y/e, X(t) «c — y/c} 
CO 
<(140())W©) | e POL) >z- y)dz 
y 
< v (c)hCy), 
where h(y) — 0 as y — oo, establishing (A3). To show that (A5) holds, note that 


P{X(t)>c,X(t+uA,) >c} 
< P{X(t) + X(t --u AC) > 2c} 


(naa! ) 


(5.37) 


lH 1 + p(t t uA) P e 2? 

-veo( 2 ) a-a TT 
a ae 

< vc) exp| (PE eee). 




By (2.7), there exists 7 > 0 such that c^[1 — p(t, t + u^c)] = nlull* Lul) for 
all t,t + uA, € [D]s. It then follows from (5.37) that (A5) holds with f(u) = 
B, exp( —u^) with 0 < A <a, for some B; > 0. 

To prove (A4), we use a technique of Fernique [16]. Leta > 0, 0 « £ <a, 
1 <é «25/7, k= $0.05 ' and w, = §~" /2«. Define 


B, = {t +k2 aA. :0 < kj <2", ki € Z}, 


F-| sup xo) > ef, 


Ul, aAc 
(5.38) 
Ej ={X(t) <c—y/c}, 
E, = [sup X(v) <c- y( uo = -++ = w/e] for r > 0, 
ve B, i 


recalling that $ 7-9 w, = 7 Note that B, C Br+1 C Ir54, and that by the conti- 
nuity of X, P(F A E-1) < V2 P(E, 1 N Ez). Moreover, 


P(E,- Ec) 
(65.39) <2+4 sup P(X(v))«c—y(1—wg—---— w, 1)/c, 


Vl, gal, 
e€(0,1)7X(0) 


X(v+e2 aA.) -c—y(1—wo—--::— w,)/c). 


Given X (v) = c — y/c, the conditional distribution of c[X (v + 2" a Ac) — X (v)] 
is normal with mean —c(c — y/c)[1 — p(v, v 4-52^* a ^c)] < 0 and variance c?[1 — 
p? (v, v + £27" a A,)], which is bounded by B(a2~")> for some B > 0, in view 
of (2.7). Hence 


| sup c[X(v 4-22 'aAc) — X(v)] > w.y|lX(v) =e — y/e} 
e€{0, 1}4 
(5.40) 


< 2? exp[-C (wr y)^/(a27)*] 


for some C > 0. Similarly, X (v + €27") — X (v) has mean 0 and variance bounded 
by B’(a2~")$ /c* for some B’ > 0 in view of (2.7). Hence by choosing C small 
enough, 


P] sup c[X(v + 6&2-"aA,) — X(v)] > Bwr) 
e€{0, 1}4 


(5.41) , 
< 2! exp[- C (w, B)? / (a2) ]. 




Let 4 := 2f /£? > 1. Combining (5.39)-(5.41) with P(X(v) € c — dy/c} ~ 
V (c)e? dy then yields 


P(Fh' E.) 


(5.42) € (1--o(1)V (0 $277 J : exp[y — Cn” y?/(4a! x”) + C'r]dy 
rz Y 


oO 
+X 2 exp[- Cn £^c^ / (4a! k^) + C'r] 


r=0 


for some C’ > 0. Let ya = a” and take 8? > (2a$x?)/C +A with A > 0. Then 
for all large c and yg < y x c, (5.42) is bounded above by w(c)N,g(yv), where 
oo oo 
Naty) 2 27 | | expty — Cn'y?/ (haf?) + C'rly 
y | 


r=0 


+ exp[-C Ay? /(Aa* x?) + c'i} 
satisfies Na (Ya) + ho y! Na(ya + y)dy — o(a?) forall s >Oand p>0. O 


PROOF OF THEOREM 4.1. Take ô >Q and let D = {t:28 « tj; < 1—28 for 
all i), 


«0-(rFoü-r0) 7, r= ink c0, 


Xe) = Za, ()/v(), — bet) —c/v(t). 


(5.43) 


Note that X.(t) has mean 0 and unit variance. We now show that conditions (C) 
and (B1)-(B5) with D, = D hold for X,(t). In view of F(t + u) = F(t) + 
u'V F(t) + o(llu|) and a similar Taylor expansion for c (t + u), 


pc (t, t - u) m F((0-— F(t +u) {Trt +u) 
=: ] — (1 + o(1))v' V F(0/[2F()(1 — F(t))} 


as u — 0. Hence (C) is satisfied with L(llul) = 1, œ = 1, r) = uVF(t)/ 
(2F(t)(1 — F(t))} and A; = (2c2)7!. Since c = o(n/5) and Ji Z,, (t) is a 
sum of 1.i.d. bounded random variables, (B1) holds by moderate deviations theory 
(cf. [15], Theorem 16.7.1). Moreover, conditioned on Fn, (t) =x, Fa, (t +uAz) = 
x + W/n,, where W is a Binomial(n, p) random variable with n = ne — x and 
p={F(t+uA,) — F())/(1 — F(t)). Making use of this and the functional cen- 




tral limit theorem, it can be shown that (B2) holds and 
Elz[Xc(t +uAz) — XcG)IXc(t) =z — y/2] 
> —u' VF(t)/(4F(t)(1— FEO), 
Cov(z[Xe(t + uAz) — Xc(f)], zLXc(t + vA) — Xc(01IXc(0) =z — y/z) 


(5.44) 


(5.45 ] 
OF 
— ) min(uj, vi) - ONO - F())}, 


i=l 

uniformly over bounded, nonnegative values of y and over t € [D]s and c/2 < z < 
c/t*. Note that (5.44) and (5.45), which are analogous to (3.12) and (3.13), give 
the mean and covariance functions of the limiting Gaussian process $W_t(u)$ in (B2)
and are in agreement with the $\alpha$ and $r_t$ of condition (C). The proof of (B3) and (B5)
uses ideas similar to those in the proofs of Lemma 3.6(ii) and Lemma 3.7, together
with large deviation (instead of Berry-Esseen) bounds for sums of i.i.d. bounded
random variables.

To prove (B4), we modify the preceding proof of (A4) in Theorem 2.1 as fol- 


lows. Leta > 0, 1 «8$ < A/2, k = 20E, wr = ET /2x and c/2xz xc/t*. 


Pick r, such that 0 < 2:zn. are 20, in which 0 will be specified below. For 


u € Ty 42 A, 2272 Az 


Zne (v) > Za, (u) — nl" LF (v) — F(v — 27 A;j)] 
(5.46) 
> Zn, (u) 0 wz, 
where w > 0 can be made arbitrarily small by choosing 0 large since ni! *9~T2 lies 
between 6~!z and 6~!z/2. Making use of (5.46), we can choose 6 large enough 
such that 


| o op zDX 0 - X] > y] nao) <2 ~y/(22)} 
(5.47) v-a? "Z Az ,a27*Z Az 
== 07, 


For fixed z and r > 0, define F, E. ;, B, and E, (r > 0) by (5.38) in which 
c is replaced by z and X(-) by X,(-). Then P(F N E.1) x P(F N E,) + 
aoa. P(E;~1 N Ez). We can then proceed as in the preceding proof of Theo- 
rem 2.1, using bounds for binomial (instead of normal) tail probabilities. 
Verification of (C) and (B1)-(B5) enables us to apply Corollary 2.9 af- 
ter introducing the change of variables t — F(t) so that F is a distribution 
function on [0,1]? (see the paragraph preceding Theorem 4.1). By (5.44), 
(5.45) and Lemma 2.3, H(t) = (TTE (8 F /at;)(t)}/{4F (D) — F(t))}". More- 
over, infyep 1/{F(£)(1 — F(0)) "(2 bp) = 2 when 8 is sufficiently small, and 
IV2 b(t)| = |VFO|7{FOd — F(0)) 77 and F(t) = 1/2 for all t e M. Hence, 




applying Corollary 2.9 to $D$ and then letting $\delta \to 0$, we obtain (4.4). Since the
probability of joint occurrence of $\{\sup_t Z_{n_c}(t) > c\}$ and $\{\inf_t Z_{n_c}(t) < -c\}$ is
negligible compared to (4.4), (4.5) follows from (4.4). □


REFERENCES 


[1] ADLER, R. (2000). On excursion sets, tube formulas and maxima of random fields. Ann. Appl. 
Probab. 10 1—74. MR1765203 
[2] ADLER, R. and BROWN, L. (1986). Tail behaviour for suprema of empirical processes. Ann. 
Probab. 14 1—30. MR815959 
[3] ALBIN, J. M. P. (1990). On extremal theory for stationary processes. Ann. Probab. 18 92-128. 
MR1043939 
[4] ALDOUS, D. (1989). Probability Approximations via the Poisson Clumping Heuristic. 
Springer, New York. MR969362 
[5] BERKES, I. and PHILIPP, W. (1979). Approximation theorems for independent and weakly 
dependent random vectors. Ann. Probab. 7 29-54. MR515811 
[6] BERMAN, S. M. (1982). Sojourns and extremes of stationary processes. Ann. Probab. 10 1-46. 
MR637375 
[7] BICKEL, P. and ROSENBLATT, M. (1973). Two dimensional random fields. In Multivariate 
Analysis III (P. R. Krishnaiah, ed.) 3-13. Academic Press, New York. MR348832 
[8] CHAN, H. P. and LAI, T. L. (2003). Saddlepoint approximations and nonlinear boundary- 
crossing probabilities of Markov random walks. Ann. Appl. Probab. 13 394—428. 
MR1970269 
[9] CHAN, H. P. and LAI, T. L. (2004). Maxima of random fields and strong limit theorems for 
sums of linear processes with long-range dependence. Technical report, Dept. Statistics, 
Stanford Univ. 
[10] CHAN, H. P. and LAI, T. L. (2004). Excursion sets of asymptotically Gaussian random fields 
and applications to signal detection. Technical report, Dept. Statistics, Stanford Univ. 
[11] CHAN, H. P. and LAI, T. L. (2004). Tail probabilities of scan statistics and asymptotically ef- 
ficient tests and on-line detection on structural changes. Technical report, Dept. Statistics, 
Stanford Univ. 
[12] CHUNG, K. L. (1949). An estimate concerning the Kolmogoroff limit distribution. Trans. Amer. 
Math. Soc. 67 36-50. MR34552 
[13] DAVYDOV, Y. A. (1970). The invariance principle for stationary processes. Theory Probab. 
Appl. 15 487-498. MR283872 
[14] DUDLEY, R. M. and PHILIPP, W. (1983). Invariance principles for sums of Banach space 
valued random elements and empirical processes. Z. Wahrsch. Verw. Gebiete 62 509-552. 
MR690575 
[15] FELLER, W. (1971). An Introduction to Probability Theory and its Applications 2, 2nd ed. 
Wiley, New York. 
[16] FERNIQUE, X. (1964). Continuité des processus Gaussiens. C. R. Acad. Sci. Paris 258 
6058-6060. MR164365 
[17] GUT, A. (1980). Convergence rates for probabilities of moderate deviations for sums of random 
variables with multidimensional indices. Ann. Probab. 8 298-313. MR566595 
[18] HARRISON, M. (1985). Brownian Motion and Stochastic Flow Systems. Wiley, New York. 
MR798279 
[19] HO, H. C. and HSING, T. (1997). Limit theorems for functionals of moving averages. Ann. 
Probab. 25 1636-1669. MR1487431 
[20] HOGAN, M. L. and SIEGMUND, D. (1986). Large deviations for the maxima of some random 
fields. Adv. in Appl. Math. 7 2—22. MR834217 




[21] HOTELLING, H. (1939). Tubes and spheres in n-spaces and a class of statistical problems. 
Amer. J. Math. 61 440—460. MR1507387 

[22] JENNEN, C. and LERCHE, R. (1981). First exit densities of Brownian motion through one- 
sided moving boundaries. Z. Wahrsch. Verw. Gebiete 55 133-148. MR608013 

[23] KOMLÓS, J., MAJOR, P. and TUSNÁDY, G. (1975). An approximation of partial sums of 
independent RV's and the DF I. Z. Wahrsch. Verw. Gebiete 32 111-131. MR375412 

[24] LAI, T. L. and SHAN, J. Z. (1999). Efficient recursive algorithms for detection of abrupt 
changes in signals and control systems. IEEE Trans. Automat. Control 44 952-966. 
MR1690539 

[25] MARCUS, M. B. and SHEPP, L. A. (1972). Sample behavior of Gaussian processes. Proc. 
Sixth Berkeley Symp. Math. Statist. Probab. 2 423-441. Univ. California Press, Berkeley. 
MR402896 

[26] OREY, S. and PRUITT, W. (1973). Sample functions of the N-parameter Wiener process. Ann. 
Probab. 1 138-163. MR346925 

[27] PHILIPP, W. and STOUT, W. F. (1975). Almost Sure Invariance Principles for Partial Sums of 
Weakly Dependent Random Variables. Amer. Math. Soc., Providence, RI. 

[28] PICKANDS, J. (1969). Upcrossing probabilities for stationary Gaussian processes. Trans. Amer. 
Math. Soc. 145 51—75. MR250367 

[29] PITERBARG, V. (1996). Asymptotic Methods in the Theory of Gaussian Processes and Fields. 
Amer. Math. Soc., Providence, RI. MR1361884 

[30] QUALLS, C. and WATANABE, H. (1972). Asymptotic properties of Gaussian processes. Ann. 
Math. Statist. 43 580-596. MR307318 

[31] QUALLS, C. and WATANABE, H. (1973). Asymptotic properties of Gaussian random fields. 
Trans. Amer. Math. Soc. 177 155-171. MR322943 

[32] RIO, E. (1993). Strong approximations for set-indexed partial sums via KMT constructions I, 
II. Ann. Probab. 21 759—790, 1706-1727. 

[33] SMIRNOV, N. V. (1944). Approximate laws of distribution of random variables from empirical 
data. Uspekhi Mat. Nauk 10 179—206. MR12387 

[34] STRASSEN, V. (1967). Almost sure behavior of independent random variables and martingales. 
Proc. Fifth Berkeley Symp. Math. Statist. Probab. 2 315—343. Univ. California Press, 
Berkeley. MR214118 

[35] WANG, Q., LIN, Y. X. and GULATI, C. M. (2003). Strong approximation for long memory 
processes with applications. J. Theoret. Probab. 16 377—389. MR1982033 

[36] WEYL, H. (1939). On the volume of tubes. Amer. J. Math. 61 461—472. MR1507388 

[37] WICHURA, M. (1985). Boundary crossing probabilities for Brownian motion. Technical report, 
Dept. Statistics, Univ. Chicago. 

[38] ZIMMERMAN, G. (1972). Some sample function properties of the two-parameter Gaussian 
process. Ann. Math. Statist. 43 1235-1246. MR317401 


DEPARTMENT OF STATISTICS
AND APPLIED PROBABILITY
NATIONAL UNIVERSITY OF SINGAPORE
REPUBLIC OF SINGAPORE 119260
E-MAIL: stachp@nus.edu.sg

DEPARTMENT OF STATISTICS
STANFORD UNIVERSITY
STANFORD, CALIFORNIA
USA
E-MAIL: lait@stat.stanford.edu


The Annals of Probability 

2006, Vol. 34, No. 1, 122-158 

DOI: 10.1214/009117905000000594 

© Institute of Mathematical Statistics, 2006 


A GAUSSIAN KINEMATIC FORMULA¹


BY JONATHAN E. TAYLOR
Stanford University 


In this paper we consider probabilistic analogues of some classical
integral geometric formulae: Weyl-Steiner tube formulae and the Chern-Federer
kinematic fundamental formula. The probabilistic building blocks are
smooth, real-valued random fields built up from i.i.d. copies of centered,
unit-variance smooth Gaussian fields on a manifold $M$. Specifically, we consider
random fields of the form $f_p = F(y_1(p), \ldots, y_k(p))$ for $F \in C^2(\mathbb{R}^k; \mathbb{R})$ and
$(y_1, \ldots, y_k)$ a vector of $C^2$ i.i.d. centered, unit-variance Gaussian fields.

The analogue of the Weyl-Steiner formula for such Gaussian-related
fields involves a power series expansion for the Gaussian, rather than
Lebesgue, volume of tubes: that is, power series expansions related to the
marginal distribution of the field $f$. The formal expansions of the Gaussian
volume of a tube are of independent geometric interest.

As in the classical Weyl-Steiner formulae, the coefficients in these
expansions show up in a kinematic formula for the expected Euler characteristic, $\chi$,
of the excursion sets $M \cap f^{-1}[u, +\infty) = M \cap y^{-1}(F^{-1}[u, +\infty))$ of the
field $f$.

The motivation for studying the expected Euler characteristic comes from
the well-known approximation $P[\sup_{p \in M} f(p) \ge u] \simeq E[\chi(f^{-1}[u, +\infty))]$.

1. Introduction. In this paper we consider perhaps the simplest non-Gaussian
models of smooth random fields on a manifold $M$. Our fields, which we refer to as
Gaussian related, are smooth, real-valued random fields built up from i.i.d. copies
of centered, unit-variance smooth Gaussian fields on a manifold $M$. Given a $C^2$,
centered, unit-variance Gaussian process $y$ on $M$ and $F \in C^2(\mathbb{R}^k; \mathbb{R})$ we consider
the field

$$f(p) = F(y_1(p), \ldots, y_k(p)),$$

where the fields $y_i$ are i.i.d. copies of $y$.
For a concrete example: set

$$F(x) = |x|^2,$$

Received November 2002; revised November 2004. 
1 Supported in part by Natural Sciences and Engineering and Research Council and the Israel Sci- 
ence Foundation. 
AMS 2000 subject classifications. Primary 60G15, 60G60, 53A17, 58A05; secondary 60G17, 
62M40, 60G70. 
Key words and phrases. Random fields, Gaussian processes, manifolds, Euler characteristic, ex- 
cursions, Riemannian geometry. 


then the field

$$f(p) = F(y_1(p), \dots, y_k(p))$$

has a $\chi^2_k$ marginal distribution, and has previously been referred to as a "$\chi^2_k$
field" [1, 20]. Note that, unlike a Gaussian field, a field with $\chi^2$ marginal distributions
is not determined solely by its covariance function. In this work the fields
are characterized by their covariance functions and the function $F$.
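The marginal claim can be checked numerically at a single point. The sketch below is illustrative only and assumes $F(x) = \|x\|^2$; the helper names `chi2_field_value` and `sample_moments` are hypothetical. It draws $k$ independent standard normals, sums their squares, and compares the sample mean and variance with the $\chi^2_k$ values $k$ and $2k$. It says nothing about the covariance structure of the field.

```python
import random


def chi2_field_value(k, rng):
    """One draw of f(p) = ||y(p)||^2 at a fixed point p, with y_1(p), ..., y_k(p) i.i.d. N(0, 1)."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))


def sample_moments(k, n_draws=100_000, seed=0):
    """Sample mean and (unbiased) sample variance of the marginal of the chi-squared field."""
    rng = random.Random(seed)
    draws = [chi2_field_value(k, rng) for _ in range(n_draws)]
    mean = sum(draws) / n_draws
    var = sum((d - mean) ** 2 for d in draws) / (n_draws - 1)
    return mean, var


# A chi-squared variable with k degrees of freedom has mean k and variance 2k.
k = 3
mean, var = sample_moments(k)
print(f"k={k}: sample mean {mean:.3f} (theory {k}), sample variance {var:.3f} (theory {2 * k})")
```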

The motivation for studying the expected Euler characteristic of the excursions
$f^{-1}[u, +\infty) \subset M$ comes from the approximation

$$P\Big[\sup_{p \in M} f(p) \ge u\Big] \simeq E[\chi(f^{-1}[u, +\infty))]$$


[1, 2, 20, 22]. This approximation has found uses in medical imaging, astrophysics 
and multivariate analysis [15, 16, 20, 23]. A heuristic justification of the above 
approximation can be found in [2, 6], with a rigorous justification in [17]. 

Our main result, Theorem 4.1, expresses the expected Euler characteristic, $\chi$,
of the excursion sets $f^{-1}[u, +\infty)$,

$$E[\chi(M \cap f^{-1}[u, +\infty))],$$

in terms of geometric quantities related to the curvature of the Riemannian structure
induced by the covariance function of $y$ [18] and geometric quantities related
to the curvature of $F^{-1}[u, +\infty) \subset \mathbb{R}^k$, viewed as a subset of the probability space
$(\mathbb{R}^k, \gamma_{\mathbb{R}^k})$, where $\gamma_{\mathbb{R}^k}$ is the standard Gaussian measure on $\mathbb{R}^k$:

$$\gamma_{\mathbb{R}^k}(A) = \int_A (2\pi)^{-k/2} e^{-\|x\|^2/2}\,dx.$$

That is, the geometric quantities related to $F^{-1}[u, +\infty)$ depend on both the curvature
of $F^{-1}[u, +\infty)$ and the measure $\gamma_{\mathbb{R}^k}$.

In Theorem 4.1, curvature enters as coefficients in certain volume of tubes expansions.
As these expansions are slightly nonstandard, we recall some basic facts
about such expansions, dating back to Weyl and Steiner [12, 14, 19]. These expansions
give an expression for the volume of a tubular neighborhood of a set
$M \subset \mathbb{R}^k$, assumed either to be convex or an embedded submanifold, in terms of
certain intrinsic measures on $M$, the so-called Lipschitz-Killing curvature (signed)
measures $(\mathcal{L}_j(M, \cdot))_{0 \le j \le n}$. These measures depend on $M$ and are finitely additive
in $M$. When $M$ is an embedded manifold in $\mathbb{R}^k$, the tube formula is due to
Weyl [19], and when $M$ is a compact, convex domain, the expansion is generally
attributed to Steiner (cf. [14]). The two formulae were generalized to sets of positive
reach by Federer [11], who defined the Lipschitz-Killing curvature measures
of such a set.


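Weyl and Steiner's formulae state that for $r$ small enough

$$\mathcal{H}_k(T(M,r)) \triangleq \mathcal{H}_k(\{y \in \mathbb{R}^k : d(y,M) \le r\}) = \lambda_{\mathbb{R}^k}(\{y \in \mathbb{R}^k : d(y,M) \le r\}) = \sum_{j=0}^{k} \mathcal{L}_{k-j}(M)\,\omega_j r^j = \sum_{j=0}^{k} \mathcal{L}_j(M)\,\omega_{k-j} r^{k-j}, \tag{1.1}$$

where $d(\cdot, M)$ is the standard distance function on $\mathbb{R}^k$; $\omega_k = \pi^{k/2}/\Gamma(k/2+1)$, the
volume of the unit ball in $\mathbb{R}^k$; $\mathcal{H}_k$ is $k$-dimensional Hausdorff measure and $\lambda_{\mathbb{R}^k}$ is
Lebesgue measure. It is sometimes useful to rewrite (1.1) in terms of Minkowski
functionals defined by

$$\frac{\mathcal{M}_{k-j}(M)}{(k-j)!} = \mathcal{L}_j(M)\,\omega_{k-j},$$

so that (1.1) reads as a (finite) Taylor series expansion

$$\lambda_{\mathbb{R}^k}(T(M, r)) = \sum_{j=0}^{k} \mathcal{M}_j(M)\frac{r^j}{j!}.$$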
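As a sanity check on the tube formula, the following sketch evaluates the Steiner sum $\sum_j \mathcal{L}_{k-j}(M)\,\omega_j r^j$ for the unit square in $\mathbb{R}^2$ and compares it with the elementary area $1 + 4r + \pi r^2$ of its $r$-neighborhood. The Lipschitz-Killing curvatures used for the square ($\mathcal{L}_0 = 1$, $\mathcal{L}_1 = 2$, $\mathcal{L}_2 = 1$) are the standard values for a convex planar body (Euler characteristic, half the perimeter, area) and are supplied by hand as an assumption of the example. The function names are hypothetical.

```python
import math


def omega(j):
    """Volume of the unit ball in R^j: pi^(j/2) / Gamma(j/2 + 1)."""
    return math.pi ** (j / 2) / math.gamma(j / 2 + 1)


def tube_volume(lk_curvatures, r):
    """Steiner sum  sum_j L_{k-j} * omega_j * r^j  for lk_curvatures = [L_0, ..., L_k]."""
    k = len(lk_curvatures) - 1
    return sum(lk_curvatures[k - j] * omega(j) * r ** j for j in range(k + 1))


def minkowski_functionals(lk_curvatures):
    """M_j = j! * omega_j * L_{k-j}, so that the tube volume equals sum_j M_j r^j / j!."""
    k = len(lk_curvatures) - 1
    return [math.factorial(j) * omega(j) * lk_curvatures[k - j] for j in range(k + 1)]


# Unit square in R^2 (values assumed, see lead-in): L_0 = 1, L_1 = 2, L_2 = 1.
square = [1.0, 2.0, 1.0]
r = 0.3
steiner = tube_volume(square, r)
direct = 1.0 + 4.0 * r + math.pi * r ** 2  # square + four side strips + four quarter discs
print(steiner, direct)
```

The same list `[L_0, ..., L_k]` gives different Minkowski functionals when padded with zeros to a higher ambient dimension, which is the dependence on the embedding dimension discussed in the text.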

One of the deep facts about the Lipschitz-Killing curvatures, first proven by
Weyl, is that they are intrinsic to $M$. That is, if we embed $M$ into a different
Euclidean space in an isometric fashion, the Lipschitz-Killing curvature measures
of $M$ are unchanged. They are also intrinsic in the Riemannian sense, because they
are local and can be computed from a Riemannian metric on $M$.

However, because the $\mathcal{L}_j(M)$ are intrinsic to $M$, the Minkowski functionals of $M$
are not intrinsic. In particular they depend on the dimension of the Euclidean
space in which $M$ is embedded. Strictly speaking, it is therefore necessary to write
$\mathcal{M}_j^{\lambda_{\mathbb{R}^k}}(M)$ to clarify which Euclidean space we are talking about, where $\lambda_{\mathbb{R}^k}$ refers
both to the dimension in which $M$ is considered to be embedded and to the
measure used to compute the volume of the tube. In Section 3 we will
compute the volume of a tube with measures other than Lebesgue; specifically
we will compute a Taylor series expansion of the standard Gaussian volume $\gamma_{\mathbb{R}^k}$
of certain tubes. These expansions will play a key role in our analogue of the
Chern-Federer kinematic fundamental formula (KFF) [7, 10, 11, 13] in which the
Lipschitz-Killing curvatures also play a prominent role.

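The KFF relates the "averaged" $j$th Lipschitz-Killing curvature $\mathcal{L}_j(M_1 \cap gM_2)$
to the Lipschitz-Killing curvatures of $M_1$ and $M_2$, averaged over "typical" rigid
motions $g$. Specifically, for $M_1$ and $M_2$, two embedded submanifolds of $\mathbb{R}^k$, the
following relation holds:

$$\int_{G_k} \mathcal{L}_j(M_1 \cap gM_2)\,d\mu_k(g) = \sum_{i=0}^{k-j} \begin{bmatrix} i+j \\ i \end{bmatrix} \begin{bmatrix} k \\ i \end{bmatrix}^{-1} \mathcal{L}_{i+j}(M_1)\,\mathcal{L}_{k-i}(M_2), \tag{1.2}$$

where

$$\begin{bmatrix} n \\ j \end{bmatrix} = \binom{n}{j}\frac{\omega_n}{\omega_j\,\omega_{n-j}}$$

and $G_k = \mathbb{R}^k \times O(k)$ is the group of rigid motions on $\mathbb{R}^k$ with Haar measure
$\mu_k = \lambda_{\mathbb{R}^k} \times \mu_{O(k)}$, the product of Lebesgue measure and the invariant probability
measure on $O(k)$.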

The Chern-Gauss-Bonnet theorem [9, 11] states that $\mathcal{L}_0(\cdot) = \chi(\cdot)$, the Euler
characteristic, so that we can rewrite the case $j = 0$ in (1.2) as an "averaged," or
"expected," Euler characteristic

$$\int_{G_k} \chi(M_1 \cap gM_2)\,d\mu_k(g) = \sum_{i=0}^{k} \begin{bmatrix} k \\ i \end{bmatrix}^{-1} \mathcal{L}_i(M_1)\,\mathcal{L}_{k-i}(M_2), \tag{1.3}$$

keeping in mind that the measure $\mu_k$ is not a finite measure.

This brings us to the subject studied in our work here. In this work we study
the expected Euler characteristic of certain random sets derived from smooth zero-mean,
unit-variance Gaussian random fields on a $C^2$ manifold $M$. Our main results
are Gaussian analogues of (1.1) and (1.3), given in Corollary 3.4 and Theorem 4.1.

Specifically, let $y = (y_1, \dots, y_k): M \to \mathbb{R}^k$ be a sequence of i.i.d. zero-mean
unit-variance Gaussian random fields (satisfying certain regularity conditions
specified in Theorem 4.1 below) on a $C^2$ manifold $M$. We derive, using techniques
similar to [1, 3, 18, 21], an expression for $E[\chi(M \cap y^{-1}D)]$, where $D$ is a suitable
$C^2$ domain in $\mathbb{R}^k$. In particular, we show in Lemmas 2.5 and 2.6 that if $M$ is a $C^2$
$n$-manifold, with or without boundary, and for suitable $F \in C^2(\mathbb{R}^k; \mathbb{R})$

$$E[\chi(M \cap y^{-1}(F^{-1}[u, +\infty)))] = \sum_{j=0}^{n} \mathcal{L}_j(M)\,\rho_j(F, u), \tag{1.4}$$

where the functions $\rho_j(F, u)$, for $j \ge 1$, are Gaussian analogues of $\mathcal{M}_j^{\lambda_{\mathbb{R}^k}}$ and
are given by integrals of certain functions over $F^{-1}\{u\} = \partial(F^{-1}[u, +\infty)) \subset \mathbb{R}^k$.
In (1.4), the quantities $(\mathcal{L}_j(M))_{0 \le j \le n}$ are the (total) Lipschitz-Killing curvature
measures of the Riemannian manifold $(M, g)$, where $g$ is the Riemannian metric
induced by the random field $f$,

$$g(X_p, Y_p) = E[X_p f \, Y_p f]$$

for tangent vectors $X_p$ and $Y_p$ in $T_pM$.

When the $y_i$'s are defined on $\mathbb{R}^n$ and are isotropic, the functionals $\rho_j(F, u)$
agree with the EC densities $\rho_{f,j}(u)$ of $F \circ y$ (cf. [21]), where it is shown that for
compact $C^2$ domains $S$

$$E[\chi(S \cap f^{-1}[u, +\infty))] = \sum_{j=0}^{n} a_{j,n}\,\mathcal{M}_{n-j}(S)\,\lambda_2^{j/2}\,\rho_{f,j}(u), \tag{1.5}$$

where $\lambda_2$ is a spectral parameter of $f$, namely the variance of its first partial derivatives,
and $a_{j,n}$ are constants, independent of $S$ or $f$. It should be noted that (1.5)
holds for any smooth isotropic field on $\mathbb{R}^n$, not just Gaussian-related fields, that
is, those derived from i.i.d. Gaussians. Lemmas 2.5 and 2.6 are thus extensions of
(1.5) to fields derived from i.i.d. Gaussians on manifolds. Further, these lemmas
show that in order to compute

$$E[\chi(M \cap y^{-1}(F^{-1}[u, +\infty)))],$$

we need only compute $(\mathcal{L}_j(M))_{0 \le j \le n}$ and $\rho_j(F, u)$.

Lemmas 2.5 and 2.6 also provide a direct way to compute $\rho_j(F, u)$, and hence
$\rho_{f,j}(u)$ for isotropic Gaussian-related fields of the form $F \circ y$, in that they are
represented as conditional expectations of random variables defined on $\mathbb{R}^k$ and
do not contain any information concerning the derivative of the fields themselves.
This should be compared with earlier works (cf. [8, 20]) where $\rho_{f,j}(u)$ (assuming
without loss of generality that $\lambda_2 = 1$) is expressed as


(1.6) py, j(u) = Eldet(V? fij IV fij; —0, fij; = “ev fij, (0), 


where $\nabla^2 f_{kl} = \partial^2 f(t)/\partial x_k\,\partial x_l$ is the Hessian of $f(t)$, $f_{|j}$ is the restriction of $f$
to a $j$-dimensional subspace of $\mathbb{R}^n$ and $\varphi_{\nabla f_{|j}}(0)$ is the density of $\nabla f_{|j}$ for some
fixed $t$. In these works, it was therefore necessary to work out the full joint distribution
of the field along with its first two derivatives at a given point $t$ and then
carry out a fairly delicate conditioning argument.

The connection between the functionals $\rho_j(F, u)$ and kinematic formulae is related
to an extension of the Weyl-Steiner tube formulae. We discuss this extension
in Section 3, where we derive a formal power series expression for the standard
Gaussian volume of a tube around $D$, that is, we derive a formal expression for

$$\int_{T(D,r)} (2\pi)^{-k/2} e^{-\|x\|^2/2}\,d\lambda_{\mathbb{R}^k}(x) = \sum_{j=0}^{\infty} \frac{r^j}{j!}\,\mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(D). \tag{1.7}$$

The notation $\mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(D)$ is chosen to highlight the analogy between $\mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(D)$, the
coefficient of $r^j/j!$ in a formal expression for the Gaussian volume of $T(D, r)$, and
$\mathcal{M}_j^{\lambda_{\mathbb{R}^k}}(D)$, the coefficient of $r^j/j!$ in a formal expression for the Lebesgue volume
of $T(D, r)$. Both expansions are exact in some cases, for instance, when $D$ or $D^c$
is a compact $C^2$ domain.

The "take-home message" of this work is Theorem 4.1, where we link Sections
2 and 3. We prove, by direct computation, that for suitable $F \in C^2(\mathbb{R}^k; \mathbb{R})$,

$$\rho_j(F, u) = (2\pi)^{-j/2}\,\mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(F^{-1}[u, +\infty)),$$

that is, the EC densities $\rho_{f,j}(u)$ for Gaussian-related isotropic fields are (up to constants)
coefficients in a Gaussian volume of tubes expansion around $F^{-1}[u, +\infty)$.

We can thus rewrite (1.4) as

$$E[\chi(M \cap y^{-1}D)] = \sum_{j=0}^{n} \mathcal{L}_j(M)\,(2\pi)^{-j/2}\,\mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(D). \tag{1.8}$$
j=0 


We conclude with some examples in Section 5, specifically rederiving, in light
of (1.8), the EC densities of Gaussian and $\chi^2$ fields. While these are not new, their
derivation sheds some light on the origin of the formulae. For instance, although
it has been shown that the EC densities for a real-valued Gaussian field are essentially
just Hermite polynomials times the standard Gaussian density, (1.8) shows
that the reason the Hermite polynomials appear is the fact that they are derivatives
of the Gaussian densities, hence (up to constants) the coefficients of powers of $r$
in a power series expansion of the Gaussian measure of the tube $[u - r, +\infty)$ of
radius $r$ around $[u, +\infty)$.
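The one-dimensional statement can be verified directly. Expanding $\gamma_{\mathbb{R}}([u - r, +\infty)) = 1 - \Phi(u - r)$ in powers of $r$ gives coefficients $H_{j-1}(u)\varphi(u)$ of $r^j/j!$ for $j \ge 1$, where $\varphi$ is the standard normal density. The sketch below assumes the probabilists' convention for the Hermite polynomials ($H_0 = 1$, $H_1 = x$, $H_{n+1} = xH_n - nH_{n-1}$), and the function names are hypothetical. It compares a truncated series with the exact Gaussian tail.

```python
import math


def hermite(n, x):
    """Probabilists' Hermite polynomial He_n(x) via He_{m+1} = x * He_m - m * He_{m-1}."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for m in range(1, n):
        h_prev, h = h, x * h - m * h_prev
    return h


def gaussian_tail(u):
    """Standard Gaussian measure of the half-line [u, +inf)."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))


def tube_series(u, r, terms=40):
    """Truncated expansion of the Gaussian measure of the tube [u - r, +inf) around [u, +inf)."""
    phi = math.exp(-u * u / 2.0) / math.sqrt(2.0 * math.pi)
    total = gaussian_tail(u)  # j = 0 term: the Gaussian measure of [u, +inf) itself
    for j in range(1, terms + 1):
        total += r ** j / math.factorial(j) * hermite(j - 1, u) * phi
    return total


u, r = 1.5, 0.5
print(tube_series(u, r), gaussian_tail(u - r))
```

In this example the series is a Taylor expansion of an entire function of $r$, so the truncation converges for every $r$. The text treats the expansions for general domains only as formal power series.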

The new results in Section 5 are the EC densities of the noncentral $\chi^2$ fields, as
well as the EC densities of what we refer to as the "correlated conjunction" random
field, which is given by taking the minimum of two correlated Gaussian fields
defined on a manifold $M$. Specifically, given $\rho \in (-1, 1)$ and $y = (y_1, y_2)$ two
independent centered, unit-variance Gaussian fields on $M$, define the new random
fields

$$z_1 = y_1,$$
$$z_2 = \rho\, y_1 + \sqrt{1 - \rho^2}\, y_2,$$
$$z_1 \wedge z_2(p) = \min(z_1(p), z_2(p)).$$

In Section 5.4 we derive the EC densities of $z_1 \wedge z_2$ by relating the EC densities
to the Gaussian measure of tubes around arbitrary cones in $\mathbb{R}^2$. The "correlated
conjunction" is a simple model for correlated random sets: the independent model
($\rho = 0$) has been used in certain neuroimaging applications [24].
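The construction of the correlated pair can be illustrated at a single point of $M$. The sketch below assumes the reading $z_1 = y_1$, $z_2 = \rho y_1 + \sqrt{1 - \rho^2}\,y_2$ with correlation parameter $\rho$; the helper names are hypothetical. It draws independent standard normals for $y_1(p)$ and $y_2(p)$, builds $z_1$, $z_2$ and their minimum, and checks empirically that $z_2$ has unit variance and correlation $\rho$ with $z_1$. It does not simulate a field over a manifold.

```python
import math
import random


def conjunction_samples(rho, n_draws=50_000, seed=1):
    """Draws of (z_1, z_2, min(z_1, z_2)) at one point, from independent N(0, 1) values y_1, y_2."""
    rng = random.Random(seed)
    z1s, z2s, mins = [], [], []
    for _ in range(n_draws):
        y1, y2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1 = y1
        z2 = rho * y1 + math.sqrt(1.0 - rho ** 2) * y2
        z1s.append(z1)
        z2s.append(z2)
        mins.append(min(z1, z2))
    return z1s, z2s, mins


def sample_correlation(xs, ys):
    """Pearson correlation of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    return cov / math.sqrt(vx * vy)


z1s, z2s, mins = conjunction_samples(rho=0.5)
print("empirical correlation:", sample_correlation(z1s, z2s))
```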

The outline of the paper is as follows: Section 2 is dedicated to deriving an explicit
expression for the EC densities $\rho_{f,j}(u) = \rho_j(F, u)$. Section 3 is dedicated to
extending the Weyl-Steiner tube formulae to measures with smooth densities with
respect to Lebesgue measure and is possibly of independent interest. In Section 4
we show that the functionals $\rho_j(F, u)$ are, up to constants, coefficients in a Taylor
series expansion for the standard Gaussian measure of a tube around $F^{-1}[u, +\infty)$.
Section 5 is devoted to examples.


2. Euler characteristic densities. In this section we derive an explicit integral
representation for the EC densities $\rho_j(F, u)$. The proof involves some preliminary
lemmas necessary to carry out the conditioning in (1.6).




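2.1. Gaussian random fields on manifolds. In this section we derive certain
expectations for Gaussian random fields on a manifold $M$. We first review some
linear algebraic preliminaries; readers are referred to [3, 18] for further details. For
an $n$-dimensional vector space $V$, let $(\mathcal{T}^*(V), \otimes)$ denote the algebra of covariant
tensors on $V$; $(\bigwedge^*(V), \wedge)$, the Grassman algebra; and $(\bigwedge^*(V) \otimes \bigwedge^*(V), \cdot)$, the
algebra of double forms

$$\textstyle\bigwedge^*(V) \otimes \bigwedge^*(V) = \bigoplus_{r,s=0}^{n} \bigwedge^{r,s}(V) = \bigoplus_{r,s=0}^{n} \bigwedge^{r}(V) \otimes \bigwedge^{s}(V)$$

endowed with the "double-wedge" product

$$(\alpha \otimes \beta) \cdot (\gamma \otimes \delta) = (\alpha \wedge \gamma) \otimes (\beta \wedge \delta).$$

The subalgebra $(\bigoplus_{j=0}^{n} \bigwedge^{j,j}(V), \cdot)$ is denoted by $(\bigwedge^{*,*}(V), \cdot)$. Any inner product
$\langle\cdot,\cdot\rangle$ on $V$ induces an inner product on $\bigwedge^*(V)$, and the trace,
$\mathrm{Tr}: \bigwedge^*(V) \otimes \bigwedge^*(V) \to \mathbb{R}$, is defined by

$$\mathrm{Tr}(\alpha \otimes \beta) = \langle\alpha, \beta\rangle$$

and extended linearly.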

We call $W: (\Omega, \mathcal{F}, P) \to \bigwedge^{1,1}(V)$ a Gaussian double form if, for any basis
$B_V = \{v_1, \dots, v_n\}$ of $V$, the matrix $W(v_i, v_j)$ has (jointly) Gaussian entries.

We recall the following useful lemma from [18]. 


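LEMMA 2.1. Suppose that $W$ is a Gaussian double form; then

$$E[W^k] = \sum_{j=0}^{\lfloor k/2 \rfloor} \frac{k!}{(k-2j)!\,j!\,2^j}\,\mu^{k-2j}\,C^j,$$

where

$$\mu \triangleq E[W] \in \textstyle\bigwedge^{1,1}(V), \qquad C \triangleq E[(W - E[W])^2] \in \bigwedge^{2,2}(V).$$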


The next result we will need in subsequent sections concerns the conditional
expectation of certain random double forms on finite-dimensional Hilbert spaces.
Specifically, let $(V, \langle\cdot,\cdot\rangle)$ be a finite-dimensional Hilbert space and $L \subset V$ a subspace
of $V$, with $P_L$ denoting orthogonal projection onto $L$. Suppose now that
$(X_i)_{1 \le i \le N \le \dim(V)}$ are i.i.d. $V$-valued random variables with common distribution
$\gamma_V$, the canonical Gaussian measure on $V$,

$$\gamma_V(A) = (2\pi)^{-\dim(V)/2} \int_A e^{-\|v\|^2/2}\,d\mathcal{H}_{\dim(V)}(v).$$

We evaluate the expression

$$E[\alpha((X_{i_1}, \dots, X_{i_k}), (X_{j_1}, \dots, X_{j_k})) \mid P_L X_1, \dots, P_L X_N] \tag{2.1}$$

for $\alpha \in \bigwedge^{*,*}(V)$ and $k$-tuples $i_k = (i_1, \dots, i_k)$ and $j_k = (j_1, \dots, j_k)$ of $\{1, \dots, N\}$.

Before evaluating (2.1), we recall the definition of the annihilation operator
on $\mathcal{T}^*(V)$, as well as that of a contraction on $\bigwedge^*(V) \otimes \bigwedge^*(V)$. For $v \in V$, we define
the annihilation operator $i_v$ on $\bigwedge^*(V)$ by defining it on $\bigwedge^r(V)$ as follows:

$$(i_v\beta)(v_1, \dots, v_{r-1}) = \beta(v, v_1, \dots, v_{r-1})$$

and, if $\beta \in \bigwedge^0(V)$, we set $i_v\beta = 0$. Although it is referred to as an annihilation
operator, its real effect is to fix the value of the first coordinate to be $v$ when the
form $i_v\beta$ is evaluated on $r - 1$ other vectors. Since $\bigwedge^*(V) \otimes \bigwedge^*(V)$ consists of two
"copies" of $\bigwedge^*(V)$, the annihilation operator can act on either the first or second
copy. To distinguish between these two cases we define operators $\eta_v$ and $\eta'_v$ on
$\bigwedge^*(V) \otimes \bigwedge^*(V)$ by setting

$$\eta_v(\beta \otimes \gamma) = i_v\beta \otimes \gamma,$$
$$\eta'_v(\beta \otimes \gamma) = \beta \otimes i_v\gamma,$$

and extending linearly.
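For $\alpha \in \bigwedge^*(V) \otimes \bigwedge^*(V)$,

$$\eta_{v_1}\eta_{v_2}\alpha = -\eta_{v_2}\eta_{v_1}\alpha, \qquad \eta'_{v_1}\eta'_{v_2}\alpha = -\eta'_{v_2}\eta'_{v_1}\alpha, \qquad \eta_{v_1}\eta'_{v_2}\alpha = \eta'_{v_2}\eta_{v_1}\alpha, \tag{2.2}$$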

as is easy to check from the definition. Also 


k 
a((Xi,, a Xi,), (X5, "t$ X 4)) -— (I Xi, i, e 


1-1 
so that the evaluation of (2.1) 1s really the evaluation of the conditional expectation 
of the polynomial 


k 
| [ 0x, nx, € L(A"), A*(V)), 
[== 


a random element of the set of linear maps from A*(V) — A*(V). 


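Fix $\{v_1, \dots, v_{\dim(V)}\}$, an orthonormal basis for $(V, \langle\cdot,\cdot\rangle)$, and $L \subset V$ a subspace
with orthonormal basis $\{v_1, \dots, v_l\}$. We define the contraction operators
$C$, $C_L$, $C_L^{\perp}$ on $\bigwedge^*(V) \otimes \bigwedge^*(V)$ as follows:

$$C\alpha = \sum_{i=1}^{\dim(V)} \eta_{v_i}\eta'_{v_i}\alpha, \qquad C_L\alpha = \sum_{i=1}^{l} \eta_{v_i}\eta'_{v_i}\alpha,$$
$$C_L^{\perp}\alpha = (C - C_L)\alpha = C_{L^{\perp}}\alpha,$$

where $L^{\perp}$ is the orthogonal complement of $L$ in $V$. For any two subspaces $L_1$, $L_2$,
(2.2) implies

$$C_{L_1}C_{L_2} = C_{L_2}C_{L_1},$$

and for $\alpha \in \bigwedge^{k,k}(V)$

$$C_L^k(\alpha) = k!\,\mathrm{Tr}^L(\alpha_L),$$

where $\alpha_L$ is the restriction of $\alpha$ to $L$.
With the notation established, we proceed to evaluate the conditional expectation (2.1).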


LEMMA 2.2. Suppose $\alpha \in \bigwedge^*(V) \otimes \bigwedge^*(V)$, $X_i$, $1 \le i \le N$, are i.i.d.
$V$-valued random variables with distribution $\gamma_V$, and $L_1, \dots, L_N$ are subspaces
of $V$. Suppose further that $i_p = (i_1, \dots, i_p)$ and $j_q = (j_1, \dots, j_q)$ are two $p$- and
$q$-tuples of distinct elements of $\{1, \dots, N\}$, arranged such that the first $n(i_p, j_q) \le
\min(p, q)$ elements of $i_p$ match those of $j_q$ and there are no further matches in
the remaining elements of $i_p$ and $j_q$. The following relation then holds:


p q 
el (TI nx, (I te, ern on PiyXv| 
[==] [=] 


n (ip, ja) : 
(2.3) = ( |] (Cr, T "Pr; Xi "P Xü ) |I "Pr, s.) 


di m-n(ip, ja)--1, 


4 
/ 
«( |] Tn, x, Je 


m=n(ip,jg)+1 


REMARK. The restriction that the indices in i and Jaq be distinct is to rule 
out the trivial case when either two of the elements of ip or jg are equal and the 


polynomial 
p q ‘ 
[=1 i=] . 


PROOF OF LEMMA 2.2. We consider first the case in which p = q = 1, and 
note that, by linearity, it suffices to prove the lemma for a € A'(V) & A‘(V). 
Working from the definition of x; Nx, 


nx, My, (@)((Wi, -o Wri) ud... Wh) 


n 


= (Xi v)yXj, wo ((ve, wi... Wr—1), Qr, wj... w, 1), 
k =l 


where B = (vi,..., Vdim(v)] is some orthonormal basis for (V, (-, )). If i = ji, 
we can further choose the orthonormal basis for V so that (vi, ..., Vdim(L;, )} is an 
orthonormal basis for Lj, and {vgim¢z;,)+1,--+» Vdim(v)} is an orthonormal basis 


for ES. The result then follows from well-known results about the conditional 
distributions of Gaussian vectors. 




Next, consider p, q > 1, 


p q 
e (Tnx, (IT, Joli iat Pay Xu] 
[=z] f=] 
jb d | l 
= {| (Tlx (Tros, apes Asan gA ar Pr, Xj, Xn 


PL des PiyXv| 


"rJ 


x Elnx;, nx, 0X1, "PR Pr, Xii, — PL; Xj, Xu] 
FA pas iX 


= (8: ; C / 
= (6i, jı Cp, ds NPE; Xi "Pr Xj) 


p q 
elf) (fes, ens nan] 
[22 [2 
the general case then follows by iteration. Li 


COROLLARY 2.3. Suppose a € A* (V), X,,..., Xy are i.i.d. with distribu- 
tion yy, iy and jy are as in Lemma 2.2 and vo is a unit vector in V . Then 


E[o((X4, ..., X4), (X5. .... X 4)) (Xi, vo); ..., (Xk, vo] 
k 
L - = 
= Ls. jek} (uns (aut) T D N Xi vo) 0 A owe Cu e) 
[=] 


ut k—1 
T T nds, fk 1] (rai vov I (x uo) vo oe (œ)) 


PROOF, This is just an application of Lemma 2.2 which uses the fact that, if 
the two sets i x and jg differ by more than two indices, then the polynomial resulting 
from the application of Lemma 2.2 will be identically zero, since the term ny (as 
well as 7) will appear at least twice [cf. (2.2)]. C 


For our purposes, we can reformulate the above corollary as follows. 


COROLLARY 2.4. Suppose y = (yi,..., yj): M — R/ are i.i.d. mean-zero, 
unit-variance Gaussian random fields on a smooth manifold M, v is a vector field 


on RJ and a € A** (RJ). Then, for any (nonrandom) set of vector fields (Z;)1<i<m 
on M, where k < m < dim(M) 

ELy*aly, «Zi, v), 1 < i < m] 


is a random double form, whose restriction to L = span(Z1, ..., Zm} satisfies, for 
each p € M, 


E[y*ojr ly, (y«Zi, v, 1 xi x m] 


= ((y« Zi, v)" 
2. lucy) I? 


in the sense that for each p € M, on the set {|\v(yp) || > 0] 
IE y*ayz (Zi, peers Zi); (Zis D Z5)) plyp: «Zi, V) p> I < i < m] 


L ; 
— Tr" AOp) vt (yyy) Hr (Zar s... Zu); (Z5. eney Zi)» 


= (Zi, v) p? | 
Kk i ESL tae UNE CT m jb 


for some universal constant Kx, j and all k-tuples (i1, ..., iy) and (j1, ..., Jk). 


L 
= Tr” Vay + o( izle. 


ixl 


PROOF. Without loss of generality, we can assume that the (Z;)j<i<m are 
orthonormal with respect to the metric induced by any one of the y;'s. The re- 
sult then follows from the fact that, for p fixed, (y. Zi(p))1zi«» are, conditional 
on y(p), i.i.d. Gaussian random vectors in Ty(p)R/ with distribution Vr, RJ » COM- 
bined with the previous corollary, along with the fact that for any vector field W 
on R, 


2 
k-1 (Wy), n») 
[C LIP) WO) NP o woe) < Kg, ; HESTE læ) gkg» 


for some universal constant K A R [] 


2.2. Integral representation of $\rho_j(F, u)$. In this section we derive expressions
for the expected Euler characteristic of the excursions of Gaussian-related fields,
which we defined to be random fields of the form

$$f(p) = F \circ y(p),$$

where $y = (y_1, \dots, y_k): M \to \mathbb{R}^k$ are i.i.d. zero-mean unit-variance Gaussian random
fields and $F \in C^2(\mathbb{R}^k)$. Recall that the EC densities were defined [21] for
isotropic fields $f$ on $\mathbb{R}^n$ as follows:

$$E[\chi(M \cap f^{-1}[u, +\infty))] = \sum_{j=0}^{n} a_{j,n}\,\mathcal{M}_{n-j}(M)\,\lambda_2^{j/2}\,\rho_{f,j}(u), \tag{2.4}$$


where the spectral parameter $\lambda_2$ is just the variance of the first-order partial derivatives
of $f$ and the $\rho_{f,j}(u)$ depend only on the finite-dimensional distributions of $f$
and its first two derivatives at any spatial location. Suppose $f_1 = F \circ y_1$ and
$f_2 = F \circ y_2$ are two (not identically distributed) isotropic Gaussian-related fields
with $\lambda_{2,1} = \lambda_{2,2}$, that is, with equal variance of the first derivatives. Then, by
inspecting the results in [20], it can be shown that the expected Euler characteristics
of the excursion sets $f_i^{-1}[u, +\infty)$ will be identical (this will in fact be proven in the
next lemma). In particular, the EC densities are functionals of $F$, so that, for isotropic
Gaussian-related fields we can rewrite (2.4) as

$$E[\chi(M \cap f^{-1}[u, +\infty))] = \sum_{j=0}^{n} a_{j,n}\,\mathcal{M}_{n-j}(M)\,\lambda_2^{j/2}\,\rho_j(F, u) \tag{2.5}$$

for some functionals $\rho_j$, $0 \le j < \infty$.
The simple case $k = 1$ and $F(x) = x$ or $F = \mathrm{Id}$, the identity map, leads to a
real-valued Gaussian field, for which it is known [1] that

$$\rho_j(\mathrm{Id}, u) = (2\pi)^{-(j+1)/2} \int_u^{\infty} H_j(x)\,e^{-x^2/2}\,dx,$$

with $H_j$ the $j$th Hermite polynomial. In the manifold setting, if $f$ is a real-valued
zero-mean unit-variance Gaussian field on a $C^2$ manifold $M$, it was shown in [18]
that

$$E[\chi(M \cap f^{-1}[u, +\infty))] = \sum_{j=0}^{n} \mathcal{L}_j(M)\,\rho_j(\mathrm{Id}, u),$$

that is, that (2.5) holds with $a_{j,n}\,\mathcal{M}_{n-j}(M)\,\lambda_2^{j/2}$ replaced by $\mathcal{L}_j(M)$, where the
Riemannian metric with respect to which the $\mathcal{L}_j(M)$ are computed is the one
induced by $f$, that is,

$$g_p(X_p, Y_p) = E[X_p f \, Y_p f],$$

which is assumed to be $C^2$.

One way to think of the relation (2.5) is as a partitioning of the geometric information
from the parameter space and information depending on the distribution of
the field. The result in the real-valued Gaussian case suggests that this partitioning
might hold for Gaussian-related fields on manifolds. This would mean that in order
to compute expected Euler characteristics, we would simply need to calculate the
Lipschitz-Killing curvatures of $(M, g)$ (recall that $g$ is the metric induced by any
one of the coordinate random fields $y_i$) and the EC densities of a corresponding
Gaussian-related isotropic field. The following lemma shows that this is indeed the
case, and our work is reduced to computing the EC densities of isotropic Gaussian-related
fields.




We set $\nabla f_E(p) = (E_1 f(p), \dots, E_n f(p))$ for some $C^1$ section $E$ of $O(M)$, the
bundle of orthonormal frames on $(M, g)$, and


te (y) = (@né") "1 Bgn (0,6) ox A iyi) 


j=l 


an approximation to the Dirac delta, as in [18]. 


LEMMA 2.5. Given $y = (y_1, \dots, y_k)$ i.i.d. suitably regular centered, unit-variance
Gaussian random fields on $M$, a $C^2$ $n$-manifold with or without boundary,
and $F \in C^2(\mathbb{R}^k)$ such that $f = F \circ y$ satisfies:


(i) lime+0 El fy, V fp (Ge) Ly pou] = Sy lime +o ELV fp (@e) 1, ru]. 
(ii) every critical point of $f$ in $f^{-1}(u - \delta, u + \delta)$ for some $\delta > 0$ is nondegenerate
(cf. [18]), $P$-a.s.


Then

$$E[\chi(M \cap f^{-1}[u, +\infty))] = \sum_{j=0}^{n} \mathcal{L}_j(M)\,\rho_j(F, u)$$

for some functionals $\rho_j(F, u)$.


PROOF. We will prove the statement when M has no boundary, omitting the 
calculations for M with boundary, as they are similar to those in Theorem 5.1 
of [18]. Further details can be found in [3]. 

Following the arguments in Theorem 4.1 of [18], under the assumptions of the 
theorem, 


Elx(Mn f^u, 4-co))] 


- J lim B[V fr (Ge) 11 ¢>u} | 


Í 
em lim E[t fzu, lV fell<e} Te” -V° f )”)] Volm,g 


M £0 


1 
=— | lim E[t dv fs xia ELT” (— v^ py, V fel] Vola, . 


MeO 


We will therefore devote our attention to evaluating 
E[C- V^ AIS, V fe] e A^" (M), 
which we will show, for each p € M is of the form 


In/2] 1 
(2.6) E[(—V? f)"|f, V felp = 2j 1 gas) 




for some random double forms A;(p) [measurable with respect to o (f (p), 
V fg (p))]. Since the distribution of the random vector (f (p), V fe(p)) € Ra 
is independent of spatial location, this will prove the claim. To see that the distrib- 
ution of the pair (f (p), V fg(p)) is independent of spatial location, note that 


and 





: OF 
V fgi(p) = Ei f (p) = 5», Eiyj(p): z— 


= OY; LOP) v (0) 


Since E is an orthonormal set of frames, the Ej; y;(p) are i.i.d. standard normal 
random variables, as are the (y1(p), ..., ye(p)) and the two sets of Gaussian ran- 
dom variables are independent of each other. 

We now turn to (2.6). Conditional on y and y,, V? f is a Gaussian double form. 
Further (cf. Section 2.4 of [18]), 


k 
OF 
ELV? fly, y*] 2 y* V^F — (Yn) 
i=l if 


= y*V?F + (VF, ~VI\x|7/2)le=y 
=y'V°F + (VFO), y), 


2 E (aps 
E[(V* f — E[V* fly, yD ly, ya] = -Q? 38) (=) 
jay SOM! 


= —(17+2R)|\VF I’. 


By Lemma 2.1, it follows that 
| 1 
- EIC V^ fl, y*] 


b) (aye » 
= Y cup VF (FO) yH)" IVE? + 20) 


£3 ( - 2k) 
\n/2) 1 | 
(2.7) -—9,-xCRyÓIvFI" 
E d 
j=0 


n—2j ¢. «72 l 
«S3 o*CVF/IVFD) 
=O l! 


(—1)? j-i ET 
mo peu FOI FOL y). 


By Corollary 2.4, 
E[(y* V^ F)'|y, V fel = EL* V? FY |y, (y«Ei, VF), ..., (y« En, VF)] 
= TY (V? Ep)! I + Er(V fe, y), 
where 


AP (M) 3 Ern (V fe, y) = O (IV felis CV? FY’ llezt pe) 


in the sense described in Corollary 2.4. Combining this with (2.7) 


] 
-ELVI fly V fel 


[n/2] EE 
-( 2. 4C RY IVE 
j=0 J 


n—2j (= 1y1-2j-1 


x 2. (n —2; pi PV FO), y) 


NEN 
x j 1^ CYR wr IF) + Em" (V fe, y) 


where, up to constant factors 


In/2] n-2j 


Em" (V fg, y) 2 X Y RI I 77 BH, 5; 4((VF(y), y))Em(V fe, y), 
j-0 1-0 


from which the claim readily follows. LI 
We have now reduced the problem of calculating expected Euler characteristics 


for Gaussian-related fields to the isotropic case. The following lemma gives an 
explicit integral representation for these EC densities. 


LEMMA 2.6. Suppose $f = F \circ z$ satisfies the conditions of Lemma 2.5,
where $z = (z_1, \dots, z_k)$ are i.i.d. isotropic, zero-mean unit-variance suitably regular
Gaussian fields on $\mathbb{R}^n$ that induce the standard Riemannian structure on $\mathbb{R}^n$.
Further, suppose that, for some $\delta > 0$:


(i) there exist $C_1, C_2 > 0$ such that on $F^{-1}(u - \delta, u + \delta)$
$$C_1 < \|\nabla F(x)\| < C_2;$$
(ii) $\nabla F$ is Lipschitz on $F^{-1}(u - \delta, u + \delta)$;


(ii) the functions 
Gn., F () 


7 m" IV FOI e (OQ) e BE? d 3G) 


4 Í IV. Fol Hai (V FO), y) 
ph 


x TY O (—v?FG)/IV FG) Qn)“ BE? 136, (x) 


are continuous on (u — ô, u + 8); 
Gv) for all n,i 


e 2 l . 
lim 5; El Lr O)-ul<e}|Hn-1-1 (-(V FO), YV FG) longi] < 00. 
Then, 


BF uw) = 5 Y C077 (^ 11) 65402. 
Ory 2 ' 


PROOF. Using the original representation of the EC densities in [1] for random 
fields on R” 


Dn (F, u) = pfn (U) 
- E (2) any; = u, V fna =0 Joco, a0 


n 








ce uM 
= lim zou E Mirae av facade (=~) det(V fin-1)}, 
where 
9f of ) 
V fin-1 = | ——,..., 
fini & 9t, —1 


is the vector made up of the first n — 1 coordinates of V f, g¢v fin-i is the joint 


density of (f, V fi, 1) and V? fi. is the matrix of the Hessian of f in the first 
n — 1 coordinates. We can rewrite the expression inside the expectation above as 


(E dev? fnn) 


n 


Í 
= mp y«(8/81,)))* 


X (— V? fy'-1((9/81;, e eeg 0/ 0t, 1), (9/011, "on es 0/8tn—1)). 




From calculations similar to those in Lemma 2.5 (just restricting V? f to the first 
n — 1 coordinates and reapplying Corollary 2.4), we see that 
af \* 2 af 
Ej { — ] det(—V*fin-ly, —,1<i<n-1 
(SH) aec v hop. 1 sin 1] 


n 


IVEI” C= nme 
Qr) - 2C 1) i = 


Tr (— V? F/V FI + Er" (V fin-1, y) 


) H, 4 a((V FQ), y)) 


VF nn ~ 
ae ya yc p(T) sue + THEA" aso). 


where 
Bre’! (V fin—1, y) 
n-—1 
= Y rni pg, a (V FO). »)90(lV fuil, lC? P) longe). 
l0 


Assumptions (i) and (iv) above imply that 


lim zag EMI O)—ul<e) L faie) TrEn" (Y fii y))[] = 0. 


Passing to the limit for the remaining terms 





u) = lim 
PF.n(u) = naw D PH 


y Pey- (" : é 


[=:0 


IVF)" ~ 
x E| E| tay jy. <e lis-sena Gn eo» | 


= lim zc pee p ) 


£0 2E 


FQ x 


(27 )1/2 Gn. so]. 


- | 
x | 5o. €)1. | F(y)—ul<e} 75172 x 


where 


Ygn-1(Bga-i (0, £/I V FCy)0D) 


no 
089 = aV FOOD 


By assumption (i), 
lim D(y, £) = Qn) € 97 
for all y e F~!(u — 6, u +8). Further for all such y, 


^ n~ (Brn 0, 
|D(y, ges Cn) < sup et e A = (x) DA A L(e) 
O<r<e/C; Qg—1r7" 


where the function L(z) is increasing and continuous on [0, +00) and 
lim L(2) = 0. 
£0 


Therefore, for € < ô, 


1 FT e£. 
5; Eltiru DG, DIY FO) lO», FO] 


No 1 m 
— (2x) P Eiro- V FO?IGs, rQ)] 


L ~ 
< E [tgp ae IV FOIS, 50D] 


The above implies that if, for some ô > 0, 


1 z 
(2.8) sup —E[1{Fyuj<e}l V FGQIGa, £0)] < +00, 
O<e<d 2€ 


we need only evaluate 
| = 
lim 5, ELLIFO)-ul<e) IVFG)IGa, rO)]. 


The Lipschitz assumption on VF allows us to apply Federer's coarea for- 
mula [5, 11] so that 


T ^ 
lim 5; ELHIFO)-uj<e) IV FOD)II Gs, FC)] 


1 " 
= lim xl | Gate Ge IU 36. (x) dz 
£0 26 (u—e,u-+e) J F-(z) idi 


zo. 4 
= lim > f G1. F(2) dz 
£0 2E J(u—e,u+e) 


= Gg, r(u). 


The Lipschitz assumption on V F and the coarea formula also imply (2.5). O 




3. Steiner formulae for Lebesgue and other measures. In the previous section
we derived an integral representation for $\rho_j(F, u)$. Ultimately, we will connect
this integral representation with a certain volume of tubes expansion, to which this
section is devoted. In this section we generalize the Weyl-Steiner tube formulae in
order to compute non-Lebesgue volume of tubes, in particular Gaussian volumes.

We will limit ourselves to tubes around $C^2$ domains $D$ in $\mathbb{R}^k$ with outward
pointing unit normal vector field $\nu$. That is, $D$ is a closed set with nonempty interior
such that $\partial D$ is an embedded $C^2$ hypersurface. We write

$$T(D, r) = \{y \in \mathbb{R}^k : d(y, D) \le r\}$$

and

$$P_D(y) = \operatorname*{arg\,min}_{x \in D} d(x, y),$$

the metric projection onto $D$ when the minimum is unique. We recall the definition
of the critical radius of $D$,

$$r_c(D) = \sup\{r \ge 0 : d(y, D) \le r \implies P_D(y) \text{ is unique}\},$$

and we will assume throughout that $r_c(D) > 0$.

The following lemma, whose proof we omit, summarizes some facts about the
distance function of $D$, particularly the growth of the eigenvalues of the shape
operator of the hypersurfaces at distance $d$ from $D$. For a hypersurface $M \subset \mathbb{R}^k$
with outward pointing normal $\nu$ and $y \in M$ the shape operator $S_y$ at $y$ is defined
by

$$S_y(X_y, Z_y) = -\langle X_y, \nabla_{Z_y}\nu\rangle,$$

where $\nabla$ represents the usual covariant derivative on $\mathbb{R}^k$.


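LEMMA 3.1. Suppose $F_D(y) = d(y, \partial D)$ for some $C^2$ domain $D$. Then,

$$\nabla F_D(y) = T_{d(y,D)\,\nu_{P_D(y)}}\big(\nu_{P_D(y)}\big),$$

where $T_x: \mathcal{T}^*(T_p\mathbb{R}^k) \to \mathcal{T}^*(T_{p+x}\mathbb{R}^k)$ represents translation from one tangent
space to another. Further, let $\lambda_1(y), \dots, \lambda_{k-1}(y)$ be the eigenvalues of the shape
operator $S_y$ acting on $T_yF_D^{-1}\{d(y, D)\}$; then

$$\lambda_i(y) = \frac{\lambda_i(P_D(y))}{1 - d(y, D)\,\lambda_i(P_D(y))}.$$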
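The eigenvalue relation can be checked on a sphere, where everything is explicit. The sketch below assumes the relation reads $\lambda_i(y) = \lambda_i(P_D(y)) / (1 - d(y, D)\,\lambda_i(P_D(y)))$ and uses the sign convention $S(X, Z) = -\langle X, \nabla_Z \nu\rangle$ with outward normal $\nu$, under which a sphere of radius $R$ has all shape-operator eigenvalues equal to $-1/R$. The function name is hypothetical. The parallel hypersurface at distance $d$ outside the ball is a sphere of radius $R + d$, so the relation should return $-1/(R + d)$.

```python
def eigenvalue_at_distance(lam0, d):
    """Shape-operator eigenvalue on the parallel hypersurface at distance d,
    given the eigenvalue lam0 at the projected boundary point (relation assumed, see lead-in)."""
    return lam0 / (1.0 - d * lam0)


# Ball of radius R: eigenvalue -1/R on the boundary under the outward-normal convention.
R, d = 2.0, 0.5
lam_boundary = -1.0 / R
lam_parallel = eigenvalue_at_distance(lam_boundary, d)
print(lam_parallel, -1.0 / (R + d))
```

For a convex domain such as the ball the denominator $1 - d\lambda$ never vanishes for $d \ge 0$, which is consistent with the critical radius being infinite in that case.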


Lemma 3.1 describes the geometry of the hypersurfaces at distance $r$ from $D$
and can be used to derive Steiner's formula, Theorem 3.2, which gives an expression
for the Lebesgue volume of the tube $T(D, r)$ around $D$, in terms of the
Lipschitz-Killing curvature measures $(\mathcal{L}_j(D, \cdot))_{0 \le j \le k}$ of $D$. The interested reader
is referred to [11-13, 16, 19] for additional details.


THEOREM 3.2 (Steiner’s formula). If we set 
A, = {(p, Xp) e N(OD): pE ANID, Xp =rvy}, 


where N (8 D) is the normal bundle of 8 D in RÝ, then Steiner’ s formula states that 
forr « rc(D) 


n jc 


; 
(3.1) J6—1 (eXpap (A7)) = 2 (j — 1)! 





AC?* CD, A). 


In the next corollary, which is just an application of Steiner's formula, we derive
a formal Taylor series expansion of the integral of smooth functions over $T(D, r)$.
The functions are assumed to be in $\mathcal{S}(\mathbb{R}^k)$, the Schwartz space of functions on $\mathbb{R}^k$.


COROLLARY 3.3. Suppose D is a C? domain in R* with critical radius 
rc(D) > 0. Further, suppose D satisfies, for any 0 < j <k, 


1 AC? 
D,dx) « oo, 
la puc a 


for some B > 0. Then, for any n, r < min(r,(D), D), e € 8(R*) there exist 
(Ci(D; 9))i»o and a constant Kn (D, Q) such that 


n+k l 


r 
dà -f dh Ci (D; 
|o, t 7 J nt Yr eiose|- 


p3 


a aie n(D, 9). 


(3.2) 








Specifically, 


Ci(D; 9) = Ef (E ) 5-1], MR (D, dx), 
mz: 
where a /dv represents differentiation in the direction of the outward unit normal v 
and M = (D, -) is defined to be zero for j > k. 


PROOF. As ọ € 4(R*) for n > 0, at each p € OD, g(expap(p,. rvp)) can be 
expressed as 


pti d'tio 


rJ dio PR dd 
pt Dd Lag py 


(3.3) 9 (expap(p. rvp)) = D "I dvi 





where a(r, p) = p --0(r, p)rvp for some 0 :[0, r.(D)] x dD > (0, 1). 


142 J. E. TAYLOR 


Combining (3.1) and (3.3) we see 


I oy) 436.10) 
CXp3 D (Aj) 


nik-l „l-l qn d" 
3 52: f dv" |, 


[0 





Mj (D, dp) 


und , pl d'tio 
tl J Ed 1 (D, dp) 
2- A (n+ 1)! dv! log, p) MED, dp 











and 

ka] xd zm 

l 4 d"*Q jui 

d ——| MAD d 
2 Pea. d 41 G2; dp) 
pati 
= Ck max max sup( ap) 
~@+D! + 1)! : 





Er diez. p)|f Mo (D, dp) 


n+l 


(n 4- 1)! 


Integrating over $\partial D$ and $[0, r]$ we are left with (3.2), which concludes the
proof. □


= 





K,(D, p). 


The following corollary gives the expression for $C_i(D; \varphi)$ when $\varphi$ is the density
of $\gamma_{\mathbb{R}^k}$ with respect to Lebesgue measure on $\mathbb{R}^k$, that is,


$$\varphi(x) = \frac{1}{(2\pi)^{k/2}} e^{-\|x\|^2/2}.$$

The $C_i(D; \varphi)$ are coefficients in the promised power series expansion of the
Gaussian volume of the tube $T(D, r)$.


COROLLARY 3.4. If ype is the canonical Gaussian measure on R^, we define, 
forl>1, 


MY (p) & &j(D; 2)" 247 IT 72) 


iat cn) 
m! 


1 


x Hy ((p, vp))e PPM (D, dp) 




and 
ME (D) Ê yg (D). 
If, further, 


1 AC? 
———— D,dp) < oo, 
Lo ppp A 


for some $\beta > 0$, then the following formal expansion holds:


$$\gamma_{\mathbb{R}^k}(T(D, r)) = \gamma_{\mathbb{R}^k}(D) + \sum_{j=1}^{\infty} \frac{r^j}{j!} \mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(D).$$


3.1. Normalization of $\mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(\cdot)$. We conclude this section by showing that the
functionals $\mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(\cdot)$ can be seen to be normalized independently of the dimension k,
and can thus be extended, at least formally, to sets in $\mathbb{R}^{\mathbb{N}}$. Because of this normalization,
we would be justified in writing $\mathcal{M}_j^{\gamma}$ instead of $\mathcal{M}_j^{\gamma_{\mathbb{R}^k}}$ in what follows,
though we keep the latter notation in order to minimize confusion.

For a finite subset A of $\mathbb{N}$, we denote the projection from $\mathbb{R}^{\mathbb{N}}$ onto $\mathbb{R}^A$ by $\pi_A$.
For $D \subset \mathbb{R}^k$, with $k = \#A$, we can define the functionals $\mathcal{M}_j^{\gamma}(\cdot)$ on cylindrical sets
of the form $\pi_A^{-1}(D)$ as

$$\mathcal{M}_j^{\gamma}(\pi_A^{-1}(D)) \triangleq \mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(D).$$


The functionals $\mathcal{M}_j^{\gamma}(\cdot)$ can then hypothetically be extended by taking limits of
cylindrical sets, though we will not pursue this matter here, except to show that
these limits exist for at least j = 0, 1, 2. Of course, the functional $\mathcal{M}_0^{\gamma}$ is well
defined, since it is just the infinite-dimensional i.i.d. Gaussian measure on $\mathbb{R}^{\mathbb{N}}$.

For j > 0, however, it is not immediately clear that the extension is possible.
The following corollary, which is really just an application of the coarea formula
in [5] and whose proof we omit, shows why such extensions are possible at least
in the cases j = 1, 2. By the coarea formula the expression for $\mathcal{M}_2^{\gamma_{\mathbb{R}^k}}(D)$ is the
integral of what is referred to as the mean Gaussian curvature of $\partial D$ in [4], and can
be extended to codimension-1 submanifolds of Wiener space and other infinite-dimensional
Gaussian measure spaces.


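COROLLARY 3.5. Suppose the $C^2$ domain D is given by $F^{-1}[u, +\infty)$. If both
$\mathcal{M}_1^{\gamma_{\mathbb{R}^k}}(F^{-1}[u, +\infty))$ and $\mathcal{M}_2^{\gamma_{\mathbb{R}^k}}(F^{-1}[u, +\infty)) < \infty$, then


$$\mathcal{M}_1^{\gamma_{\mathbb{R}^k}}(F^{-1}[u, +\infty)) = E\big[\|\nabla F\| \,\big|\, F = u\big]\, \varphi_F(u),$$


$$\mathcal{M}_2^{\gamma_{\mathbb{R}^k}}(F^{-1}[u, +\infty)) = E\left[-LF + \frac{\nabla^2 F(\nabla F, \nabla F)}{\|\nabla F\|^2} \,\middle|\, F = u\right] \varphi_F(u),$$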




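where $\varphi_F(\cdot)$ is the density of the random variable $F(Z_1, \ldots, Z_k)$ for $Z_i$ i.i.d.
$N(0,1)$ and


$$LF(x) = \sum_{i=1}^{k} \left( \frac{\partial^2 F}{\partial x_i^2}(x) - x_i \frac{\partial F}{\partial x_i}(x) \right)$$


is the Ornstein-Uhlenbeck operator on $C^2(\mathbb{R}^k)$.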


4. EC densities and Weyl-Steiner formulae: the Gaussian KFF. In Sec- 
tion 2 we derived an expression for the EC densities of Gaussian-related fields 
involving the integral of certain functions G, ;. r over the surface F—!{u}. In Sec- 
tion 3 we derived expressions for coefficients in a formal series expansion of the 
canonical Gaussian volume of a tube around a C? domain D. In this section we 
will show that these expressions agree, up to a constant. Specifically, we show that 
for suitable F € C?(R*) 


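$$\rho_j(F, u) = (2\pi)^{-j/2} \mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(F^{-1}[u, +\infty)),$$


which by Lemma 2.5 translates into the following Gaussian KFF:


$$\int_{\Omega} \chi\big(M \cap y(\omega)^{-1} D\big)\, dP(\omega) = \sum_{j=0}^{\dim M} (2\pi)^{-j/2} \mathcal{L}_j(M)\, \mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(D). \tag{4.1}$$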


THEOREM 4.1 (Gaussian KFF). Suppose y = (yi,..., yx) are i.i.d. real- 
valued zero-mean unit-variance suitably regular (cf. [18]) Gaussian random fields 
on a C? n-dimensional manifold M that satisfy 


P| sup IB) — s olo > e| =o") 
qe€Bi(p,h) 


for any h < ho and any € > 0 where 


~ 2 
Ji (P) = (yi(p), Vy EG), V^yig(p)) ER x R” x® R”, 
(x, V, Hla = xl + MV lir + Allez pn 
and V fe (resp. V? fg) are the coefficients of V f (resp. V? f) "Ead off in a fixed 
orthonormal frame (E, ..., En). 


Let fp = Fp o y where F p is the distance function of a C? domain in R*, not 
necessarily compact, with r.(D) > 0 which satisfies: 


(i) the functions 
o=] erx? dH) 
Fp {z} 


are continuous on some neighborhood of zero where v is the outward pointing unit 
normal vector field; 




(ii) the second fundamental form S of $\partial D$ is bounded, that is,
$$|S(X, Y)_x| \le K \|X_x\| \|Y_x\|$$
for some $K > 0$ and all $x \in \partial D$.
Then, 


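$$E\big[\chi(M \cap f_D^{-1}[0, +\infty))\big] = E\big[\chi(M \cap y^{-1} D)\big] = \sum_{j=0}^{\dim M} (2\pi)^{-j/2} \mathcal{L}_j(M)\, \mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(D). \tag{4.2}$$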


In particular, the above relation holds for every compact $C^2$ domain D, and its
complement $D^c$.


REMARK. The conditions above are not necessary; indeed, there are cases 
such as the F field defined in [20] (cf. Section 5.3) where the boundedness condi- 
tion on the second fundamental form of the boundary of the domain is not satisfied 
but the EC densities exist and are given by 


n--1— —1 I 
De pr ("7 1) im Efron lY FIGO] 


However, it should be noted that in this case, the above coefficients are not the true
coefficients in the corresponding power series, because the domain $D_u$, defined in
Section 5.3, has $r_c(D_u) = 0$.


REMARK. Using the generalized Morse theorem of [16] or stratified Morse
theory [3], Theorem 4.1 can be extended to include piecewise smooth domains
and/or submanifolds of $\mathbb{R}^k$ (i.e., the domains in the space where the random fields
take values). We give an example of a piecewise smooth convex domain D in
Section 5.4, where we calculate the $\mathcal{M}_j^{\gamma_{\mathbb{R}^2}}(D)$ for D a cone in $\mathbb{R}^2$ and show that
the result agrees with known results for right-angled cones [24]. One approach to
these generalizations, following the notions of continuity of the Lipschitz-Killing
curvature measures as in [11], would be to construct a limiting argument to justify
the following computation for certain smooth D sets with positive critical radius
in $\mathbb{R}^k$:


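$$\int_{\Omega} \chi\big(M \cap y(\omega)^{-1} D\big)\, dP(\omega)
= \lim_{r \to 0} \int_{\Omega} \chi\big(M \cap y(\omega)^{-1} T(D, r)\big)\, dP(\omega)$$

$$= \lim_{r \to 0} \sum_{j} \frac{\mathcal{L}_j(M)}{(2\pi)^{j/2}} \mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(T(D, r)) = \sum_{j} \frac{\mathcal{L}_j(M)}{(2\pi)^{j/2}} \mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(D).$$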




In the interest of brevity, we will not pursue these generalizations here. The above 
theorem, however, does specify the functional form that all of these generalizations 
should have, that is, the contribution of the parameter space is in the form of the 
Lipschitz- Killing curvature measures, which can be defined for piecewise C^ sub- 
manifolds of C? manifolds, and the contribution from the Gaussian space is in the 
form of the coefficients in an expansion for the volume of a Gaussian tube, which 
can similarly be defined for piecewise C^ manifolds in R*. For a more geometric 
approach to the above problem, see [3] which shows that (4.2) can be thought of 
as a limit of the classical KFF. 


REMARK. Note that the conditions on the $y_i$'s are not overly restrictive
(cf. [18]), as $C^2$ fields whose second derivatives have a covariance function
satisfying the "usual" $1/(-\log h)^{1+\alpha}$ moduli of continuity conditions are included
above.


PROOF. Most of the work has been done in the preliminary lemmas. The con- 
clusion | 


Pj (Fp, u) = (20) I? AG (D) 


follows from the fact that fp satisfies the conditions of Lemmas 2.5 and 2.6. 
Lemma 3.1 ensures that the second fundamental form, or shape operator S of dD 
is bounded. Verifying the equality above then follows from comparing the defini- 
tions in Corollary 3.4 of 49" (D) and those of Gn, Pp (0) in Lemma 2.6, noting 
that for x € 9D and all X,, Y, E T, aD, 


V Fp(x) ncm Vy, 
V^Fp(X,, Yz) = —S(Xx, Yx). 


The fact that the conditions of Lemma 2.5 are satisfied follows from Lemmas 
2.4 and 2.5 of [18] combined with the growth conditions on y; and its derivatives 
as well as the boundedness of S. As for the conditions of Lemma 2.6, since $F_D$
is a distance function of a set with positive critical radius, conditions (i) and (iv)
(cf. [11]) of Lemma 2.6 are automatically satisfied. A straightforward calculation
shows that assumptions (i) and (ii) above imply conditions (ii) and (iii) of
Lemma 2.6. □


5. Examples. Throughout this section, $y = (y_1, \ldots, y_k) : M \to \mathbb{R}^k$ will denote
a generic sequence of i.i.d. zero-mean unit-variance Gaussian fields on M
satisfying the conditions of Theorem 4.1, whose dimension k will vary as needed
in each example.




5.1. Real-valued Gaussian processes. Given a unit vector $z \in \mathbb{R}^k$, we define
the function $\psi_z(x) = \langle x, z \rangle$, so that $\psi_z \circ y$ is a real-valued, centered unit-variance
Gaussian random field. As mentioned in Section 2, the EC densities of $\psi_z \circ y$ are
given, for $j \ge 1$, by


$$\rho_j(\psi_z, u) = \frac{1}{(2\pi)^{(j+1)/2}} H_{j-1}(u)\, e^{-u^2/2}. \tag{5.1}$$


This result can easily be rederived in light of Theorem 4.1 as follows. Let $D_u =
\psi_z^{-1}([u, +\infty))$ be a half-space in $\mathbb{R}^k$. Clearly


$$T(D_u, r) = \psi_z^{-1}[u - r, +\infty) = D_{u-r},$$

that is, a tube around a half-space in $\mathbb{R}^k$ is another half-space. As

$$\frac{d^j}{dx^j} e^{-x^2/2} = (-1)^j H_j(x)\, e^{-x^2/2},$$


it follows that


$$\gamma_{\mathbb{R}^k}(T(D_u, r)) = 1 - \Phi(u - r) = 1 - \Phi(u) + \frac{1}{\sqrt{2\pi}} \sum_{j=1}^{\infty} \frac{r^j}{j!} H_{j-1}(u)\, e^{-u^2/2},$$


so that


$$\mathcal{M}_j^{\gamma_{\mathbb{R}^k}}(D_u) = \frac{1}{\sqrt{2\pi}} H_{j-1}(u)\, e^{-u^2/2},$$


which, by Theorem 4.1, implies (5.1).
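The half-space computation can be checked numerically. The sketch below sums the series $\gamma(D_u) + \sum_{j \ge 1} \frac{r^j}{j!} \frac{1}{\sqrt{2\pi}} H_{j-1}(u) e^{-u^2/2}$ and compares it with the exact Gaussian measure $1 - \Phi(u - r)$ of the shifted half-space. It assumes the probabilists' Hermite polynomials (recurrence $H_{n+1} = xH_n - nH_{n-1}$); the function names and the truncation level are illustrative choices, not part of the paper.

```python
import math

def hermite(n, x):
    """Probabilists' Hermite polynomial H_n(x) via H_{n+1} = x H_n - n H_{n-1}."""
    h_prev, h = 1.0, x
    if n == 0:
        return h_prev
    for m in range(1, n):
        h_prev, h = h, x * h - m * h_prev
    return h

def gauss_tail(u):
    """1 - Phi(u) for the standard normal distribution."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))

def tube_series(u, r, terms=30):
    """Partial sum of gamma(D_u) + sum_{j>=1} r^j / j! * M_j(D_u)."""
    phi = math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    total = gauss_tail(u)
    for j in range(1, terms + 1):
        total += r ** j / math.factorial(j) * hermite(j - 1, u) * phi
    return total
```

For instance, `tube_series(1.0, 0.5)` and `gauss_tail(0.5)` should agree to within floating-point accuracy, since the tube of radius 0.5 around $D_1$ is the half-space $D_{0.5}$.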


5.2. The $\chi^2$ and noncentral $\chi^2$ case. In this subsection we derive the EC densities
of a $\chi^2_k$ random field, as defined in [1, 20]. Note that the EC densities were
also derived in [20], but we rederive them here as a simple application of Theorem 4.1.
We set $G(x) = \|x\|^2$; then $g = G \circ y$ is a $\chi^2_k$ random field on M. Note
that the EC densities of $\sqrt{g}$ are related to the EC densities of g by


$$\rho_{\sqrt{g},j}(\sqrt{u}) = \rho_j(\sqrt{G}, \sqrt{u}) = \rho_j(G, u) = \rho_{g,j}(u),$$


so that it suffices to calculate the EC densities of $\sqrt{g}$.
We set


$$D_x = \sqrt{G}^{-1}[x, +\infty) = \mathbb{R}^k \setminus B_{\mathbb{R}^k}(0, x),$$


so that $T(D_x, r) = D_{x-r}$. Therefore,

$$\gamma_{\mathbb{R}^k}(T(D_x, r)) = \gamma_{\mathbb{R}^k}(D_{x-r}).$$


It remains to express the right-hand side above as a Taylor series in r.
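The density $f_k = d(\sqrt{G}_*(\gamma_{\mathbb{R}^k}))/d\lambda_{\mathbb{R}}$ of the square root of a $\chi^2_k$ random variable
is

$$f_k(x) = \frac{x^{k-1} e^{-x^2/2}}{\Gamma(k/2)\, 2^{(k-2)/2}}$$

and

$$\gamma_{\mathbb{R}^k}(D_{x-r}) = \gamma_{\mathbb{R}^k}(D_x) - \sum_{j=1}^{\infty} \frac{(-r)^j}{j!} \frac{d^{j-1} f_k}{dx^{j-1}}(x).$$

Direct calculations show that

$$\frac{d^{j-1} f_k}{dx^{j-1}}(x) = \frac{x^{k-j} e^{-x^2/2}}{\Gamma(k/2)\, 2^{(k-2)/2}} \sum_{l=0}^{\lfloor (j-1)/2 \rfloor} \sum_{m=0}^{j-1-2l} 1_{\{k \ge j-m-2l\}} \binom{k-1}{j-1-m-2l} \frac{(-1)^{m+l} (j-1)!}{m!\, l!\, 2^l}\, x^{2m+2l}.$$


Combining the two equations gives

$$\gamma_{\mathbb{R}^k}(D_{x-r}) = \gamma_{\mathbb{R}^k}(D_x) + \sum_{j=1}^{\infty} \frac{r^j}{j!} \frac{x^{k-j} e^{-x^2/2}}{\Gamma(k/2)\, 2^{(k-2)/2}} \sum_{l=0}^{\lfloor (j-1)/2 \rfloor} \sum_{m=0}^{j-1-2l} 1_{\{k \ge j-m-2l\}} \binom{k-1}{j-1-m-2l} \frac{(-1)^{j-1+m+l} (j-1)!}{m!\, l!\, 2^l}\, x^{2m+2l}.$$

We thus conclude, by Theorem 4.1, that for $j \ge 1$,

$$\rho_j(\sqrt{G}, x) = \frac{x^{k-j} e^{-x^2/2}}{(2\pi)^{j/2}\, \Gamma(k/2)\, 2^{(k-2)/2}} \sum_{l=0}^{\lfloor (j-1)/2 \rfloor} \sum_{m=0}^{j-1-2l} 1_{\{k \ge j-m-2l\}} \binom{k-1}{j-1-m-2l} \frac{(-1)^{j-1+m+l} (j-1)!}{m!\, l!\, 2^l}\, x^{2m+2l}$$


and thus the EC densities for $j \ge 1$ of the $\chi^2_k$ random field g are given by


$$\rho_{g,j}(u) = \frac{u^{(k-j)/2} e^{-u/2}}{(2\pi)^{j/2}\, \Gamma(k/2)\, 2^{(k-2)/2}} \sum_{l=0}^{\lfloor (j-1)/2 \rfloor} \sum_{m=0}^{j-1-2l} 1_{\{k \ge j-m-2l\}} \binom{k-1}{j-1-m-2l} \frac{(-1)^{j-1+m+l} (j-1)!}{m!\, l!\, 2^l}\, u^{m+l},$$

which agrees with [20].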
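The "direct calculation" behind the $\chi^2_k$ densities is the $(j-1)$-th derivative of $x^{k-1} e^{-x^2/2}$. Writing that derivative as $p(x) e^{-x^2/2}$, the polynomial $p$ should equal $x^{k-j} \sum_{l} \sum_{m} 1_{\{k \ge j-m-2l\}} \binom{k-1}{j-1-m-2l} \frac{(-1)^{m+l}(j-1)!}{m!\, l!\, 2^l} x^{2m+2l}$. The following sketch compares the two with exact rational arithmetic. The normalizing constant $\Gamma(k/2)\, 2^{(k-2)/2}$ is left out because it does not depend on x. The function names are illustrative assumptions, and the closed form is the double sum as read from the text above.

```python
from fractions import Fraction
from math import comb, factorial

def derivative_poly(k, n):
    """Coefficients {power: coeff} of p, where
    d^n/dx^n [x^(k-1) e^(-x^2/2)] = p(x) e^(-x^2/2)."""
    p = {k - 1: Fraction(1)}
    for _ in range(n):
        q = {}
        for power, c in p.items():
            if power > 0:  # term from differentiating x^power
                q[power - 1] = q.get(power - 1, 0) + c * power
            # term from differentiating e^(-x^2/2)
            q[power + 1] = q.get(power + 1, 0) - c
        p = {a: c for a, c in q.items() if c != 0}
    return p

def closed_form_poly(k, j):
    """The same polynomial for n = j - 1, built from the double sum over l and m."""
    p = {}
    for l in range((j - 1) // 2 + 1):
        for m in range(j - 1 - 2 * l + 1):
            if k >= j - m - 2 * l:  # indicator 1_{k >= j - m - 2l}
                coeff = Fraction(
                    (-1) ** (m + l) * comb(k - 1, j - 1 - m - 2 * l) * factorial(j - 1),
                    factorial(m) * factorial(l) * 2 ** l,
                )
                power = k - j + 2 * m + 2 * l
                p[power] = p.get(power, 0) + coeff
    return {a: c for a, c in p.items() if c != 0}
```

For example, `derivative_poly(4, 2)` and `closed_form_poly(4, 3)` both describe the second derivative of $x^3 e^{-x^2/2}$, namely $(6x - 7x^3 + x^5) e^{-x^2/2}$.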


Using the above formula for the $\chi^2_k$ EC densities, we can derive the EC densities
of a noncentral $\chi^2_k$, which we define to be a Gaussian-related field with the function


$$G_\mu : \mathbb{R}^k \to \mathbb{R}, \qquad G_\mu(y) = \|y - \mu\|^2.$$


Recalling that the density $f_{\alpha,k}$ of the square root of a noncentral $\chi^2_k$ random variable
with noncentrality parameter $\alpha$ can be written as


$$f_{\alpha,k}(x) = \sum_{i=0}^{\infty} e^{-\alpha/2} \frac{\alpha^i}{2^i\, i!} f_{k+2i}(x),$$


where $f_k(x)$ is as above, the density of the square root of a $\chi^2_k$ random variable, and noting that, in
our case $\alpha = \|\mu\|^2$, we obtain the following.


LEMMA 5.1. 


(G "M aware —— 4 Deo 
ied Qi! Qx)JPT((k 4-2i1)/2)2€2i—52 


. wy NC ) 


Lik>j-m-21-2i} (, 1m7] 


M in (jiita j — Sen 


l=0 m=0 


mM! 


5.3. The F case. In this subsection we derive the EC densities of an $F_{k_1,k_2}$
random field, as defined in [20]. Unlike the previous two subsections, instead of
using the representation of the EC densities as coefficients in a Taylor series expansion
of the volume of certain tubular neighborhoods, for the $F_{k_1,k_2}$ field we use
the expression for $\mathcal{M}_j^{\gamma_{\mathbb{R}^{k_1+k_2}}}(\cdot)$ given in Corollary 3.4.




We set

$$F(y) = \frac{k_2 \sum_{i=1}^{k_1} y_i^2}{k_1 \sum_{i=1}^{k_2} y_{k_1+i}^2}$$

and define the $F_{k_1,k_2}$ random field f by $f = F \circ y$. Setting $D_u = F^{-1}[u, +\infty)$
we see


9D, =F Mu) — U Suus (R9) x S GU). 
reRt 
The definition of the functionals ( t mes 
y) — 0 for all y e R^ and 


(-));>0, along with the fact (V F(y), 


0, "Obi n odd, 
Hy (0) = jy = 


imply that 
Qn) B; (F,u) = MP"? (Dy) 
LG=D/2) p44 
=(j—1)! b» ns NN, NR 
= 2! (Qn) Kira) /2 


x | e IF M; (Du; dx) 
aD, 


(5.2) | 
LG—1)/2] (—1) 1 


w a a ! Nu ABE PRECES DES R 
—(j— 1) 2. ID Gaeta? 


X | Qo 1 
9D, (j —21 — 1) 
x Tr? Pu (Sic t) d JCc e, 1 (x). 
We now proceed to evaluate, for 0 < m < kı 4- k2 — 1, 


Tr°PF (s? )(y) = T?Pro (iv Fy) II! V? FG)tpzg)") 


in terms of U(y) = Gey and V(y) = Taar and G(y) = U(y)/ 
VO). 


LEMMA 5.2. We have the following expression for Tr°PF (S35 xe 


1 1 T 
IN spp) = — (IVF I V^Fiops)" ) 


= I "5, m-i ci [ki — 1 k5—]1 
- (ovem) D » vw Y i ) 




Further, 


l l 9D. -|xi? 
uium. - — '[r (Sap Je f d Jc i, — 1(x) 


— T (a +k: —m — 1)/2) (Re) P ( 7 oe 
—O2m-DPT (1 /2)7 (5/2) k2 


k2 
«Mc D" (m) EU 


PROOF. Since F = koU/k,V, a straightforward calculation shows 


2 ] 
IVF |= G(1 +G). JW 
"Hc AE en dried 
1 
VF UVF = TLS 
x (zv = nv & dU -- dU &dV) 
+ m dV &dV — VV), 


We now evaluate the matrix of $\|\nabla F\|^{-1} \nabla^2 F$ in a specific set of orthonormal
frames of $\nabla F^{\perp}$. Considering $\mathbb{R}^{k_1+k_2}$ to be the product $\mathbb{R}^{k_1} \times \mathbb{R}^{k_2}$ with the product
metric, we fix subspaces $L_1$, the kernel of $\nabla U$ in $\mathbb{R}^{k_1}$, and similarly $L_2$, the kernel
of $\nabla V$ in $\mathbb{R}^{k_2}$, for which we choose orthonormal frames $B_1$, $B_2$. The desired set of
frames for the kernel of $\nabla F$ is then $B = (B_1, B_2, X)$, where


| G 1 | 1 1 
T a ee VY 
14+ 62/0 1+G2/V 


The matrix of —|| V F|-1 V?F in this set of frames is diagonal, with entries 


ky —1times ky — times 
PER NM 1 LG G,0 
VGd16). $e. yo og ,Q), 


from which the result follows by expanding the trace of the mth power of such a 
diagonal matrix. 




Federer's coarea formula (cf. [5, 11]) implies 


l sad 
aaaea —ixi^/2 8D, fom 
(277) 10/2 I... i "i r (S3p,) d €, (x) 


mm |] ade som 
= lim = fg, UVF Hates) TP" (Zp) dtr sta GO 


= 1 8Dr ( oem LE d F.(Ygki e) 
=E|IVFI— Tr i Jop] = u | etu 


The second conclusion now follows after substituting in the density d Fy. (Ypk +. )/ 
dìg and noting that V(1-- G) —U +V — Xi. dp. is independent of G and 


T'((k1 + k2)/2 + p) 


ENU + VYF] = 2? A 
No eode ec 172) D 


Combining (5.2) and the previous lemma we have, in agreement with [20], the 
following 


LEMMA 5.3. The EC densities for the Fy, k, random field are given by 
(2x)! BF, u) 


 T((a ct E - 1/2) (Bey 77 ( n a tid 
~ .20-9P?T (ky /2)T (0/2) N k2 kz 
LG—1)/2] , 
l'((k1 +k — j))/2 4-1 
x (-1 1G —- 1)! 23 CUm cec 
pO MR +k- 7/201! 


j-2l~1 


. lu i4-l 
x 2. (nin) 


x ( ky ~ 1 ) s — 3 
j-1-2l-i i 
5.4. Cones in $\mathbb{R}^2$ and correlated conjunctions. In this section we study conjunctions
of correlated Gaussian fields, which can be defined
in terms of the minimum of two correlated Gaussian processes. Specifically,
as in the Introduction, given $y = (y_1, y_2)$ i.i.d. centered unit-variance Gaussian
processes on some manifold M, we form two new Gaussian processes as follows:


$$z_1 = y_1, \qquad z_2 = \rho\, y_1 + \sqrt{1 - \rho^2}\, y_2,$$


and define $Z_1$ and $Z_2$, our isotropic versions of these processes, as in Section 3. We
define the conjunction of $z_1$ and $z_2$ at a point $p \in M$ by


$$z_1 \wedge z_2(p) = \min(z_1(p), z_2(p)).$$

It is easy to see that

$$(z_1 \wedge z_2)^{-1}[u, +\infty) = z_1^{-1}[u, +\infty) \cap z_2^{-1}[u, +\infty) = y^{-1}(K(u, \rho)),$$

where $K(u, \rho)$ is a cone in $\mathbb{R}^2$ so that, to calculate the EC densities for
this process, it suffices to calculate the expected Euler characteristic of $M \cap
y^{-1}K$ for a general cone with arbitrary vertex in $\mathbb{R}^2$, which we proceed to
do.
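The reduction to a cone can be illustrated with a short simulation. The sketch below builds $z_1, z_2$ from independent standard normals and checks that $\min(z_1, z_2) \ge u$ exactly when $(y_1, y_2)$ lies in a cone with non-negative coordinates along two generators. The vertex $(u,\, u(1-\rho)/\sqrt{1-\rho^2})$ and the generators $(0, 1)$ and $(\sqrt{1-\rho^2}, -\rho)$ used here are worked out in this sketch by intersecting the two boundary lines; they are assumptions of the illustration, as are all function names and sample sizes.

```python
import math
import random

def conjunction_fields(y1, y2, rho):
    """z1 = y1 and z2 = rho*y1 + sqrt(1 - rho^2)*y2 at a single point."""
    return y1, rho * y1 + math.sqrt(1.0 - rho * rho) * y2

def cone_coordinates(y1, y2, u, rho):
    """Coefficients (a1, a2) with y = w + a1*(0, 1) + a2*(s, -rho), s = sqrt(1 - rho^2)."""
    s = math.sqrt(1.0 - rho * rho)
    w = (u, u * (1.0 - rho) / s)  # intersection of {y1 = u} and {rho*y1 + s*y2 = u}
    a2 = (y1 - w[0]) / s
    a1 = (y2 - w[1]) + rho * a2
    return a1, a2

def in_cone(y1, y2, u, rho):
    a1, a2 = cone_coordinates(y1, y2, u, rho)
    return a1 >= 0.0 and a2 >= 0.0

def sample_check(u=0.7, rho=0.4, n=20000, seed=0):
    """Return (fraction of agreeing samples, mean of z2^2, mean of z1*z2)."""
    rng = random.Random(seed)
    agree, sum_z2_sq, sum_z1_z2 = 0, 0.0, 0.0
    for _ in range(n):
        y1, y2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        z1, z2 = conjunction_fields(y1, y2, rho)
        agree += (min(z1, z2) >= u) == in_cone(y1, y2, u, rho)
        sum_z2_sq += z2 * z2
        sum_z1_z2 += z1 * z2
    return agree / n, sum_z2_sq / n, sum_z1_z2 / n
```

The second and third returned values estimate the variance of $z_2$ (which should be close to 1) and the covariance of $z_1$ and $z_2$ (which should be close to $\rho$).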
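Define the cone


$$C(v_1, v_2, w) = \{z \in \mathbb{R}^2 : z = w + a_1 v_1 + a_2 v_2,\ a_1, a_2 \ge 0\}
= w + \{z \in T_w\mathbb{R}^2 : z = a_1 v_1 + a_2 v_2,\ a_1, a_2 \ge 0\},$$


where $T_w\mathbb{R}^2$ is the tangent space to $\mathbb{R}^2$ at w. A sketch of the cone appears in
Figure 1. We derive an expression for the quantity


$$\mathcal{M}_j^{\gamma_{\mathbb{R}^2}}(C(v_1, v_2, w))$$
which, by Theorem 4.1 and the remarks following it, allows us to compute
$$E\big[\chi(M \cap y^{-1} C(v_1, v_2, w))\big].$$
Associated to $C(v_1, v_2, w)$ is its normal cone
$$C^{\perp}(v_1, v_2, w) = \{z \in T_w\mathbb{R}^2 : \langle z, v_1 \rangle \le 0,\ \langle z, v_2 \rangle \le 0\},$$
with link
$$L(C^{\perp}(v_1, v_2, w)) = \{z \in S(T_w\mathbb{R}^2) : \langle z, v_1 \rangle \le 0,\ \langle z, v_2 \rangle \le 0\},$$


where $S(T_w\mathbb{R}^2)$ is the unit sphere in $T_w\mathbb{R}^2$.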
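In this case, the domain $C(v_1, v_2, w)$ is not smooth; however, for any $\delta > 0$,
$T(C(v_1, v_2, w), \delta)$ is at least $C^1$ and the coefficients of


$$\gamma_{\mathbb{R}^2}\big(T(T(C(v_1, v_2, w), \delta), r)\big),$$


as a power series in r will, for small $\delta > 0$, be close to those of $\gamma_{\mathbb{R}^2}(T(C(v_1, v_2, w), r))$,
that is, for small $\delta > 0$,


$$E\big[\chi(M \cap y^{-1} C(v_1, v_2, w))\big] \simeq E\big[\chi(M \cap y^{-1}(T(C(v_1, v_2, w), \delta)))\big]$$

$$= \sum_{j} \frac{1}{(2\pi)^{j/2}} \mathcal{L}_j(M)\, \mathcal{M}_j^{\gamma_{\mathbb{R}^2}}\big(T(C(v_1, v_2, w), \delta)\big)$$

$$\simeq \sum_{j} \frac{1}{(2\pi)^{j/2}} \mathcal{L}_j(M)\, \mathcal{M}_j^{\gamma_{\mathbb{R}^2}}\big(C(v_1, v_2, w)\big),$$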




FIG. 1. Tube of radius r around C(v1, v2, w). 


where, for the cone $C(v_1, v_2, w)$, $\mathcal{M}_j^{\gamma_{\mathbb{R}^2}}(C(v_1, v_2, w))$ is defined as the coefficient
of $r^j/j!$ in a Taylor series expansion of $\gamma_{\mathbb{R}^2}(T(C(v_1, v_2, w), r))$. The expansion is split up
into integrals over three regions, depicted in Figure 2.

Without loss of generality, we can choose a basis of $T_w\mathbb{R}^2$ so that the coefficients
of $v_1$ are (1, 0) and those of $v_2$ are $(\cos\theta, \sin\theta)$ where $\theta = \cos^{-1}(\langle v_1, v_2 \rangle)$. We
then set $v_1^{\perp}$ to be the unit vector orthogonal to $v_1$ such that $\langle v_1^{\perp}, v_2 \rangle > 0$, that is,
the coefficients of $v_1^{\perp}$ with respect to our chosen basis are (0, 1). The volume of
the first region, $R_1(r)$, is thus


vig2(Ri(r)) = yga(fz € R^: (up, z) e (vio, w) — 7, (vi, w)], (v1, z) > (v1, w))) 


= (1 mS AS w))) Y : (uy, wye "rw /2 
j=l“ ° 




FIG. 2. Regions of integration for power series expansion of yg2 (T (C (v1, v2, w))). 


By symmetry, 


Vigo (Ra (r)) = yg((z € R^: (uy, z) e [luz w) — r, (uy, w)], (v2, z) > (vo, w)}) 


1 — P((v2, a 
= SSG — Hj—1((vy, w))e 2n ") 


Finally, using polar coordinates 





pit] gqig- liz 7/2 
2(Ro(r)) = u —— ——————— dt dv 
in l " En 0,r]x L(CÀ- (vi v, w)) j! dvl zy 
ASI di 25-2 
«af EU x 
2n pen. L(Cl(w,v,w) dvi T" 













Writing v = —sinÓ - vı — cosÓ - vt, we have 
LH 2 , l 
di—2e-llzl/2 med NETT : 
TCR = (-1)/~* (sind + cos] e WA 
dv! Z=W dv dvi EN 
j-?2 ,, ? 
1—0 


= 2 
x Hj 24v, w)) Hi((vi, w))e "1 72. 


Noting that v € L(C* (v4, v2, w)) if and only ifü c (O, 7 — 0) we see 
1 CO rj j-2 ANN) 
Ygz (Ra (7)) = on 2 E > G-I) P ] | 
de Oe a 


MY 2 
x Hj-2-4((v1, w)) Hii, w))e 7 


T= s 62-6 LX az 
x | sin? ^ 8 cos’ 0 d0. 
0 


Straightforward calculations show 





Kj1(0) — (j — 1) rf sint? 6 cos! 0 dà 
Lu By-1-n2,d+yr(vsin8), if 0 > /2, 
= yig, j-1-D/2,04)/2 — I Bj-1-0/2,04-)/2(Vsin8 ) 
+i A By-1-p/2,04/2 —— . 0 «272, 
where 


X 
IBy, v (X) = i mig — pl gr 
0 


is the incomplete beta function and By, y, = Z By, ,, (1) is the beta function. 
Putting the above together, we have proved the following. 


LEMMA 5.4. The coefficient of rJ / j1 in the power series expansion of 


Vg (T (C (vi, v2, w), r)) 


is given by 


MF (Cui, v2, w)) 


YR2 (C(u, U2; w)), j — 0, 
(17 9(vi,w)) -wtw , (17 o2 WI) tup 
Vf 2x A/ 2t = 
ped 
COE n iu. wye- vv? 
- A 2t 
" DOTEM H;i (wF, we ebur 
^/ LTE 
DM /j-2 L yz lwl?,2 
+> ( jo) Kj Hj-a-in, we) Hir oye MPP, 
l0 : 


j 22. 


It remains only to relate the above lemma to our original goal, that is, the EC 
densities of the field z1 ^ z2, which amounts to determining v1, v2 and w for the 


cone K (u, p). We can take vi = (1, 0) and vl —(p,Jl— 0?) so that vı = (0, 1) 
and v2 = (4/1 — p?, —p) and w = (u,u/./1+ p). 


Acknowledgments. The author would like to thank Drs. Robert Adler and 
Keith Worsley for their help and encouragement as thesis advisors, which is where 
this work originated. 


REFERENCES 


[1] ADLER, R. J. (1981). The Geometry of Random Fields. Wiley, Chichester. MR611857 

[2] ADLER, R. J. (2000). On excursion sets, tube formulae, and maxima of random fields. Ann. 
Appl. Probab. 10 1—74. MR1765203 

[3] ADLER, R. J. and TAYLOR, J. E. (2006). Random Fields and Geometry. Birkhäuser, Boston. 
To appear. Preliminary versions of Chapters 1—12 available at http://iew3.technion.ac.il/ 
-radler/grf.pdf. 

[4] AIRAULT, H. (1991). Differential calculus on finite codimensional submanifolds of the Wiener 
space—the divergence operator. J. Funct. Anal. 100 291—316. MR1125228 

[5] AIRAULT, H. and MALLIAVIN, P. (1988). Intégration géométrique sur l'espace de Wiener. 
Bull. Sci. Math. (2) 112 3-52. MR942797 

[6] ALDOUS, D. (1989). Probability Approximations via the Poisson Clumping Heuristic. 
Springer, New York. MR969362 

[7] BERNIG, A. and BROCKER, L. (2002). Lipschitz-Killing invariants. Math. Nachr. 245 5-25. 
MR1936341 

[8] Cao, J. and WORSLEY, K. J. (1999). The detection of local shape changes via the geometry 
of Hotelling's T? fields. Ann. Statist. 27 925-942. MR1724036 

[9] CHERN, S.-S. (1944). A simple intrinsic proof of the Gauss-Bonnet formula for closed 
Riemannian manifolds. Ann. of Math. (2) 45 747—752. MR11027 




[10] CHERN, S.-S. (1966). On the kinematic formula in integral geometry. J. Math. Mech. 16 
101-118. MR198406 

[11] FEDERER, H. (1959). Curvature measures. Trans. Amer. Math. Soc. 93 418—491. MR110078 

[12] GRAY, A. (1990). Tubes. Addison-Wesley Publishing Company Advanced Book Program, 
Redwood City, CA. MR1044996 

[13] KLAIN, D. A. and ROTA, G.-C. (1997). Introduction to Geometric Probability. Cambridge 
Univ. Press. MR1608265 

[14] SCHNEIDER, R. (1993). Convex Bodies: The Brunn-Minkowski Theory. Cambridge Univ. 
Press. MR1216521 

[15] SIEGMUND, D. O. and WORSLEY, K. J. (1995). Testing for a signal with unknown location 
and scale in a stationary Gaussian random field. Ann. Statist. 23 608-639. MR1332585 

[16] TAKEMURA, A. and KURIKI, S. (2002). Maximum of Gaussian field on piecewise smooth 
domain: Equivalence of tube method and Euler characteristic method. Ann. Appl. Probab. 
12 768-796. MR1910648 

[17] TAYLOR, J., TAKEMURA, A. and ADLER, R. (2005). Validity of the expected Euler character- 
istic heuristic. Ann. Probab. 33 1362-1396. MR2150192 

[18] TAYLOR, J. E. and ADLER, R. J. (2003). Euler characteristics for Gaussian fields on manifolds.
Ann. Probab. 31 533-563. MR1964940

[19] WEYL, H. (1939). On the volume of tubes. Amer. J. Math. 61 461—472. MR1507388 

[20] WORSLEY, K. J. (1994). Local maxima and the expected Euler characteristic of excursion sets
of $\chi^2$, F and t fields. Adv. in Appl. Probab. 26 13-42. MR1260300

[21] WORSLEY, K. J. (1995). Boundary corrections for the expected Euler characteristic of excursion
sets of random fields, with an application to astrophysics. Adv. in Appl. Probab. 27
943-959. MR1358902

[22] WORSLEY, K. J. (1995). Estimating the number of peaks in a random field using the Hadwiger
characteristic of excursion sets, with applications to medical images. Ann. Statist.
23 640-669. MR1332586

[23] WORSLEY, K. J. (1996). The geometry of random images. Chance 9 27-40. 

[24] WORSLEY, K. J. and FRISTON, K. J. (2000). A test for a conjunction. Statist. Probab. Lett. 47 
135-140. MR1747100 


DEPARTMENT OF STATISTICS 
SEQUOIA HALL 

STANFORD UNIVERSITY 

STANFORD, CALIFORNIA 94305-4065 
USA 

E-MAIL: jtaylor@stat.stanford.edu 


The Annals of Probability

2006, Vol. 34, No. 1, 159-180

DOI: 10.1214/009117905000000288

© Institute of Mathematical Statistics, 2006


NOTES ON THE TWO-DIMENSIONAL FRACTIONAL 
BROWNIAN MOTION 


By FABRICE BAUDOIN AND DAVID NUALART¹
Toulouse University of Technology and Universitat de Barcelona 


We study the two-dimensional fractional Brownian motion with Hurst
parameter $H > \frac{1}{2}$. In particular, we show, using stochastic calculus, that this
process admits a skew-product decomposition and deduce from this representation
some asymptotic properties of the motion.


1. Introduction. We consider a complex fractional Brownian motion started
at $z_0 = x^1 + i x^2 \in \mathbb{C}^*$,
$$B_t = B_t^1 + i B_t^2, \qquad t \ge 0,$$


where $B^1$ and $B^2$ are two independent fractional Brownian motions with the same
parameter $H \in (0, 1)$. That is, for i = 1, 2, $B^i = \{B_t^i, t \ge 0\}$ is a Gaussian process
with mean $E(B_t^i) = x^i$ and covariance


$$R(t, s) = \tfrac{1}{2}\big(t^{2H} + s^{2H} - |t - s|^{2H}\big).$$


Notice that, in the case $H = \frac{1}{2}$, B is a planar Brownian motion. In what follows,
we shall always assume that $H > \frac{1}{2}$. In this case, it is known that B is a transient
process (see [20]).
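To make the definition concrete, the following sketch evaluates the covariance $R(t, s)$ and draws the planar process on a finite grid of positive times by a Cholesky factorisation of the covariance matrix. This is one standard way to sample a Gaussian vector and is an assumption of the sketch, not a construction used in the paper; the function names, grid and seed are likewise illustrative.

```python
import math
import random

def fbm_cov(t, s, hurst):
    """R(t, s) = (t^{2H} + s^{2H} - |t - s|^{2H}) / 2."""
    return 0.5 * (t ** (2 * hurst) + s ** (2 * hurst) - abs(t - s) ** (2 * hurst))

def cholesky(a):
    """Lower-triangular L with L L^T = a; raises ValueError if a is not positive definite."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def sample_fbm(times, hurst, rng):
    """One centered fBm path on strictly positive, distinct times."""
    cov = [[fbm_cov(t, s, hurst) for s in times] for t in times]
    low = cholesky(cov)
    z = [rng.gauss(0.0, 1.0) for _ in times]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(times))]

def sample_planar_fbm(times, hurst, z0, rng):
    """Complex path z0 + (B^1 + i B^2) built from two independent coordinates."""
    re = sample_fbm(times, hurst, rng)
    im = sample_fbm(times, hurst, rng)
    return [z0 + complex(a, b) for a, b in zip(re, im)]
```

With `hurst = 0.5` the covariance reduces to `min(t, s)`, the Brownian case mentioned in the text.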

The aim of this paper is to use the stochastic calculus for the fractional Brownian
motion which has been recently developed by several authors (see, e.g., [1, 2, 4,
11]) in order to obtain geometric properties of this motion, as has been done in
the case of the planar Brownian motion (see, e.g., [6, 8, 12, 13, 17, 18]). The key
point of our study is an analogue of the celebrated skew-product decomposition of
the two-dimensional Brownian motion. Precisely, we show that we can write


$$B_t = z_0 \exp\left(\int_0^t \frac{dB_s}{B_s}\right), \qquad t \ge 0,$$


where the integral on the right-hand side can be understood not only path-wise, but
also as a local divergence. We shall see that it is then possible to apply the ergodic
theorem to study the asymptotics of the integral


$$\int_0^t \frac{dB_s}{B_s}, \tag{1.1}$$







Received March 2003; revised October 2004.
¹Supported by DGES Grant BFM2000-0598.
AMS 2000 subject classifications. 60F15, 60G15, 60G18, 60H05.
Key words and phrases. Ergodic theorem, functionals of fractional Brownian motion, planar fractional
Brownian motion, stochastic integrals, windings.




160 E. BAUDOIN AND D. NUALART 


whose real part (resp. the imaginary part) describes the radial part of B (resp. the
angular part of B).

In that respect, the study of the two-dimensional fractional Brownian motion
with Hurst parameter $H > \frac{1}{2}$ could seem simpler than the study of the planar
Brownian motion, for which it is not possible to apply directly the ergodic theorem.
Nevertheless, for the fractional Brownian motion, we shall see that the study of the
windings is much more difficult because the integral (1.1) is not a time-changed
fractional Brownian motion.


2. Itô's formula for holomorphic functions. Before we study the skew-product
decomposition, we need to develop stochastic calculus for the two-dimensional
fractional Brownian motion. In particular, we shall need a two-dimensional
Itô's formula.

Let $\mathcal{H}$ be the Hilbert space defined as the closure of the set of step functions on
$\mathbb{R}_+$ with respect to the scalar product


$$\langle \mathbf{1}_{[0,t]}, \mathbf{1}_{[0,s]} \rangle_{\mathcal{H}} = R(t, s).$$


For i = 1,2, the mapping 1jo,4, — B] — x! can be extended to an isometry be- 
tween J€ and the first chaos Hj isocialed with B'. We denote this isometry by 
p — B' (p). If p € H®, we set B(g) = (B' (pl), B^(g^)). 

Let 4 be the set of smooth and cylindrical random variables of the form 


(2.1) F = f (B(gi), ..., B@n)), 


where n > 1, f € Cz? (R^") (f and all its partial derivatives are bounded), and 
pi € HSÞ. The derivative operator D of a smooth and cylindrical random variable 
F of the form (2.1) is defined as the J£9?-valued random variable 


DF = Y (SF we, 5 De]. 
j=l 
where Z = (B(q1),..., B(@n)), and we make use of the notation f(z1,...,2Zn) 
and z; = (xj, yj), for j — 1,...,n. The derivative operator D is then a closable 
operator from L? (Q) into L?(Q; #82) for any p > 1. For any p > 1, the Sobolev 
space ID. is the closure of 4 with respect to the norm 


IF itp =EIF|? -EIDFI. 


In a similar way, given a Hilbert space V , we denote by D? (V) the corresponding 
Sobolev space of V -valued random variables. 

The divergence operator 6 is the adjoint of the derivative operator, defined by 
means of the duality relationship 


(2.2) E(F0(u)) = E(DF, u) e2, 


PLANAR FRACTIONAL BROWNIAN MOTION 161 


where u is a random variable in L7(Q; 368%). That is, we say that u belongs to the 
domain of the operator 5, denoted by Dom ô, if the mapping F > E(DF, u) 4e 
is continuous in L*(Q). If u € Dom ô is a two-dimensional stochastic process, we 
will make use of the notation 


oo CO 
5a) = | 3B; + f u? 8B. 
0 


The space D12 (H82) is included in Dom ô. For an element u in D!:? (368°), we 
have 


(2.3) E(5(u))* = Ellul? + E(Du, (Du)*) ges. 


We recall that # contains the space of real-valued functions g on R4 such that 


Í j | - Ip) Ilo(s) [lt — s|- at ds < oo. 
0 JO 


The following result has been proved in [2]. 


PROPOSITION 1. Let u = (u;)s>0 be a two-dimensional stochastic process in 
the space ID^?(3€9?) such that, for some T > 0, 


T pT 
&(f | hullusllt — s? dt ds ) < oo. 
0 JO 


E | ,|Drugl|Dsunllt — s 7219 — n?” -dt du d dn < 00 


, 


and 
T eT 
Lf Dust- ds dt « oo, 


a.s. Then the symmetric integral defined as the limit in probability 


T T 
4B, =lim(2e)"! | B — Bg- 
f Ura Dt A) £) A us( (s2-&e)AT (s evo) ds 

exists and we have 

T Pom 
(2.4) Í E = 8(u) box | Í (Dlul + D2u2)|t — s? qs qt, 

0 Jo 

where ag — H (2H — 1). 


We denote by D^ (H82) the set of J€9?-valued random variables u such that 


there exists a sequence ((Q", u"^)), n > 1] C F x IDE-?(3692) such that Q” 4 Q 
a.s. and u = u”, a.e. on [0, T] x Qa. We then say that {(Q”, u”)} localizes u in 
D?(3t9?). If ue Di? (H82), by the local property of the divergence operator, 
we can define without ambiguity ó(u) by setting 


(u)|g» = 8(u") |r 




for each n > 1, where ((€2", u”)} is a localizing sequence for u in D'?(J082), 

If f is a real-valued function on R? twice continuously differentiable, then d 
each t > 0, V f (B.)1(o, belongs to Dy’*( #2") and the following version of Itó 
formula holds (see [2]): 


fB) = f(x!, x2) + [5 (B. 8B! 4+ l5 OT (p.p? 
dy 


+H [ (= (B) af 55089); 2-1 dg, 


On the other hand, we will denote by $\int_0^t u_s\, dB_s$ the Riemann-Stieltjes integral
of a stochastic process $u_t$ whose trajectories are $\gamma$-Hölder continuous, provided
$\gamma > 1 - H$ (see [21]).

If $f : \mathbb{C} \to \mathbb{C}$ is a twice continuously differentiable function in the variables
$(x, y) \to f(x + iy)$, formula (2.5) leads to


f (Br) = f (zo) + [5 2-8) 5B! 


(2.5) 


(2.6) 
z^ : 2H-1 
+f 3; (Bs) BB M. Af (Bs)s ds, 


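where $\frac{\partial f}{\partial z} = \frac{1}{2}\left(\frac{\partial f}{\partial x} - i \frac{\partial f}{\partial y}\right)$ and $\frac{\partial f}{\partial \bar{z}} = \frac{1}{2}\left(\frac{\partial f}{\partial x} + i \frac{\partial f}{\partial y}\right)$.

PROPOSITION 2. Let $f : \mathbb{C} \to \mathbb{C}$ be a holomorphic function, then


$$f(B_t) = f(z_0) + \int_0^t f'(B_s)\, \delta B_s \tag{2.7}$$


$$\phantom{f(B_t)} = f(z_0) + \int_0^t f'(B_s)\, dB_s, \tag{2.8}$$


where $f'(z) = \frac{\partial f}{\partial z}$.


PROOF. Equation (2.7) follows from (2.5) and the fact that $\frac{\partial f}{\partial \bar{z}} = 0$ and
$\Delta f = 0$. Equation (2.8) follows from the change of variables formula for integrals
with respect to Hölder-continuous functions. □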


REMARK 3. Unlike two-dimensional Brownian motion, two-dimensional
fractional Brownian motion is not conformally invariant. It means that, in general,
$\int_0^t f'(B_s)\, \delta B_s$ is not a time-changed fractional Brownian motion. Nevertheless, let
us recall that, in [3], it is shown that, for a deterministic function g which satisfies
some regularity assumptions and for a one-dimensional fractional Brownian
motion B with Hurst parameter $H > \frac{1}{2}$, we can write a decomposition


t 
Í g(s)dBP = Bj eit ds +A;, t20, 




where $A_t$ is a bounded variation process and $\beta$ a fractional Brownian motion with
Hurst parameter $H$. Hence, a natural question is whether such a decomposition
exists (with a suitable time-change) for the two-dimensional process $f(B_t)$;
so far this question has no answer.


As an immediate corollary of the previous proposition, we also deduce the fol- 
lowing: 


COROLLARY 4. Let $\alpha = f(z)\,dz$ be a holomorphic one-form on $\mathbb{C}$, then
$$\int_{B[0,t]} \alpha = \int_0^t f(B_s)\,\delta B_s = \int_0^t f(B_s)\,dB_s,$$
where $\int_{B[0,t]} \alpha$ denotes the integral of $\alpha$ along the trajectories of $B$.


REMARK 5. We recall that we can define the integral of a holomorphic one-
form on any continuous path.


REMARK 6. The previous equality is interesting because it relates objects of 
different natures. 


From now on, since the different integrals coincide for holomorphic functions,
we shall use indifferently the notation $\delta$ or $d$.
As a consequence of Itô's formula, we can deduce the following.


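PROPOSITION 7. Let $f:\mathbb{C} \to \mathbb{C}$ be a holomorphic function, then


$$E\Big(\Big(\mathrm{Im} \int_0^t f(B_s)\,dB_s\Big)^2\Big) = E\Big(\Big(\mathrm{Re} \int_0^t f(B_s)\,dB_s\Big)^2\Big)$$
and
$$E\Big(\Big(\mathrm{Im} \int_0^t f(B_s)\,dB_s\Big)\Big(\mathrm{Re} \int_0^t f(B_s)\,dB_s\Big)\Big) = 0.$$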


PROOF. In fact, this is a consequence of Itô's formula. Indeed, let us consider
a holomorphic function $F$ such that $F' = f$. From Itô's formula, we have to show
that


$$E\big((\mathrm{Im}(F(B_t) - F(z_0)))^2\big) = E\big((\mathrm{Re}(F(B_t) - F(z_0)))^2\big)$$
and
$$E\big(\mathrm{Im}(F(B_t) - F(z_0))\,\mathrm{Re}(F(B_t) - F(z_0))\big) = 0.$$
Now, notice that after a suitable scaling, these equalities only involve the centered
reduced normal law, and are, hence, easily checked. $\square$


The following proposition provides the skew-product decomposition for the 
fBm. 




PROPOSITION 8 ("Skew-product decomposition for the fBm"). We have


$$B_t = z_0 \exp\Big(\int_0^t \frac{\delta B_s}{B_s}\Big) = z_0 \exp\Big(\int_0^t \frac{dB_s}{B_s}\Big), \qquad t \ge 0.$$


PROOF. If we apply Itô's formula to


$$B_t \exp\Big(-\int_0^t \frac{dB_s}{B_s}\Big),$$


we see that it is constant, and so equal to $z_0$. $\square$








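Notice the important fact that the divergence integral $\int_0^t \frac{\delta B_s}{B_s}$ is local. Indeed, if
not, we would have


$$E\Big(\mathrm{Re} \int_0^t \frac{\delta B_s}{B_s}\Big) = E\Big(\int_0^t \frac{B^1_s\,\delta B^1_s + B^2_s\,\delta B^2_s}{|B_s|^2}\Big) = 0$$


and from Itô's formula,
$$E(\ln |B_t|) = E(\ln |z_0|),$$


which is clearly absurd.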


3. Polar decomposition. First of all, we study the polarity of points for the
complex fractional Brownian motion $B$ issued from $z_0 \neq 0$. For this, it is enough
to adapt the original proof by [6] of the polarity of points for the planar Brownian
motion.


PROPOSITION 9. If $H \ge \frac{1}{2}$, then the points are polar for $B$, that is, $\forall x \in \mathbb{C}$,
$P(\exists t > 0,\ B_t = x) = 0$.


PROOF. We proceed in several steps. We assume $H > \frac{1}{2}$ because the case
$H = \frac{1}{2}$, which corresponds to the case of the Brownian motion, is well known.

In [19] it has been proved that a.s. the Hausdorff measure $\mathcal{H}_\phi$ of the fractional Brownian
curve is finite, where $\phi(\varepsilon) = \varepsilon^{1/H} \log\log(1/\varepsilon)$. This clearly implies that the
Lebesgue measure of the fractional Brownian curve is zero. For the sake of completeness,
we provide a simple and direct proof of this result. Let us denote by $\mu$
the Lebesgue measure on the plane and note that


$$E(\mu\{B_t(\omega),\ 0 \le t \le 1\}) \le \pi E\Big(\sup_{0 \le t \le 1} |B_t|^2\Big).$$


Hence,


$$E(\mu\{B_t(\omega),\ 0 \le t \le 1\}) < +\infty.$$


Now, from the scaling property of the fBm, for $T \in \mathbb{N}^*$,
$$E(\mu\{B_t(\omega),\ 0 \le t \le T\}) = T^{2H} E(\mu\{B_t(\omega),\ 0 \le t \le 1\}).$$


But, since the increments are stationary,


$$E(\mu\{B_t(\omega),\ 0 \le t \le T\}) \le \sum_{i=1}^{T} E(\mu\{B_t(\omega),\ i-1 \le t \le i\}) \le T\,E(\mu\{B_t(\omega),\ 0 \le t \le 1\}).$$
We deduce that
$$T^{2H} E(\mu\{B_t(\omega),\ 0 \le t \le 1\}) \le T\,E(\mu\{B_t(\omega),\ 0 \le t \le 1\}),$$

which implies

$$E(\mu\{B_t(\omega),\ t \ge 0\}) = 0.$$
Now, if there exists at least one $x \in \mathbb{C} \setminus \{z_0\}$ such that

$$P(\exists t > 0,\ B_t = x) > 0,$$
then we claim that, for all $z \in \mathbb{C} \setminus \{z_0\}$,

$$P(\exists t > 0,\ B_t = z) > 0.$$


This follows from the rotational invariance and the scaling property of the fBm.
Since


$$E(\mu\{B_t(\omega),\ t \ge 0\}) = 0,$$
we have, for all $z \in \mathbb{C} \setminus \{z_0\}$,
$$P(\exists t > 0,\ B_t = z) = 0.$$
To conclude the proof, we have now to show that
$$P(\exists t > 0,\ B_t = z_0) = 0.$$


For this, notice that, for $\lambda > 0$, the process $(B_{t+\lambda} - B_\lambda)_{t \ge 0}$ is still a fractional
Brownian motion (this is a Gaussian process with the same covariance function
as $B$). Hence, if we had


$$P(\exists t > 0,\ B_t = z_0) > 0,$$
then we would have, for $\lambda > 0$,
$$P(\exists t > 0,\ B_{t+\lambda} - B_\lambda = z_0) > 0,$$


which is absurd because $z_0 \neq 0$. $\square$




REMARK 10. It is known (see, e.g., [20]) that, for $H \in (0, \frac{1}{2})$, the sample
paths of $B$ have interior points, so that $E(\mu\{B_t(\omega),\ 0 \le t \le 1\}) > 0$. In particular,
the points are not polar. Notice also that, for $H \in (0, \frac{1}{2}]$, $B$ is recurrent, whereas it
is transient for $H > \frac{1}{2}$.

Assume now that $z_0 = 0$. Since 0 is polar for the complex fBm, we can consider
for $t > 0$ the radial part
$$\rho_t = |B_t|$$
and the angular part
$$\Theta_t = \frac{B_t}{|B_t|}.$$


PROPOSITION 11. For each $t > 0$, $\Theta_t$ is independent of the process $(\rho_s)_{s \ge 0}$
and uniformly distributed on the unit circle.


PROOF. Let $X$ be an $\mathcal{R}_\infty$-measurable random variable, where $\mathcal{R}$ is the natural
filtration of $(\rho_s)_{s \ge 0}$, and let $f$ be a Borel function on the circle. We know that the
law of $B$ is rotationally invariant, and this implies that, for all $\theta \in [0, 2\pi]$,


$$E(X f(\Theta_t)) = E(X f(e^{i\theta} \Theta_t)).$$
Integrating this relation yields
$$E(X f(\Theta_t)) = E\Big(X \frac{1}{2\pi} \int_0^{2\pi} f(e^{i\theta} \Theta_t)\,d\theta\Big).$$


Since the Lebesgue measure on the circle is invariant by translation, we deduce
that


$$\frac{1}{2\pi} \int_0^{2\pi} f(e^{i\theta} \Theta_t)\,d\theta$$
is constant and does not depend on $\Theta_t$. This implies that
$$E(X f(\Theta_t)) = E(X)\,\frac{1}{2\pi} \int_0^{2\pi} f(e^{i\theta})\,d\theta,$$
which gives the expected result. $\square$


4. Asymptotics of the skew-product and limit theorems. We come back to
the case $z_0 \neq 0$ and study the asymptotics of $\int_0^t \frac{\delta B_s}{B_s} = \int_0^t \frac{dB_s}{B_s}$ as $t$ tends to infinity.
First we present some preliminary results.





LEMMA 12. For any real number $k > 1$, we have


$$E\Big(\Big|\int_1^k \frac{dB_s}{B_s - z_0}\Big|\Big) < +\infty.$$













PROOF. In order to show this property, we express the path-wise Riemann- 
Stieltjes integral in terms of fractional derivatives, using the fractional integration 
by parts formula (see [22]): 


k dB, p 
-f D (a gon By (s) ds, 


1 Bg, — Zo 





where $1 - H < \alpha < \frac{1}{2}$, and the left and right fractional derivatives are given by


PE Fa Gants tel Gate”) 


S (g—)o 


1 (Sr y f 1/(Bs — zo) — 1/(By — zo) iy). 
1 


and 


~Td-a)\ @— De (s — yer 
Clearly, for all p > 1, 
(4.1) E( sup |Di-? Bi. (yi^) < oo 
1<s<k 


On the other hand, for 1 < p < 2, 


(4.2) (fs). 


and 
«(f (f e S y =z) dy) ds) 


k $ iB EN B | P 
E araa ie RNRR ) 
n (/ (J (s — y)**![B, — zoll By — zol ay ds) 


(4.3) S 
ii lar f | Gara Sar up ^4] 
« Ck? (IE(G?4 y) /4 sup (z ( 2 )" < 00. 
t<s<k\ A Bs — zo|??4 
where 


|By — B;| 


G = ———— 
l<y<s<k (s — y)H-s i 




provided 7 + $ = 1, 2pq < 2, and p(w + 1 — H + £) < 1. The estimates (4.1), 
(4.2) and (4.3) imply the desired result. C 


REMARK 13. The previous result is not true for $H = \frac{1}{2}$, that is, in the case of
the planar Brownian motion.


Now, we prove the ergodicity of the scaling for the fractional Brownian motion. 


LEMMA 14. For $k \in \mathbb{R}_+^* \setminus \{1\}$, the transformation on the path space $T_k : \omega \to \frac{\omega(k\,\cdot)}{k^H}$,
which preserves the law of a one-dimensional fBm issued from 0, is ergodic.


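PROOF. Let $\beta$ be a one-dimensional fractional Brownian motion issued
from 0. It suffices to show that $T_k$ is mixing. For this, we note that, for $s, t > 0$
and $n \in \mathbb{N}^*$,


$$E\Big(\beta_s \frac{\beta_{k^n t}}{k^{nH}}\Big) = \frac{1}{2 k^{nH}} \big(s^{2H} + (k^n t)^{2H} - |s - k^n t|^{2H}\big),$$


so that


$$\lim_{n \to +\infty} E\Big(\beta_s \frac{\beta_{k^n t}}{k^{nH}}\Big) = 0,$$


which implies the mixing property of $T_k$. $\square$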
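
The decay of this covariance can be checked numerically. The following minimal Python sketch evaluates the covariance between $\beta_s$ and the rescaled coordinate $\beta_{k^n t}/k^{nH}$ from the standard fBm covariance $\frac{1}{2}(s^{2H} + t^{2H} - |s-t|^{2H})$. The function names `fbm_cov` and `scaled_cov` and the parameter values ($s = t = 1$, $k = 2$, $H = 0.75$) are illustrative assumptions, not part of the original argument.

```python
def fbm_cov(s, t, hurst):
    """Covariance E[beta_s * beta_t] of a one-dimensional fBm issued from 0."""
    return 0.5 * (s ** (2 * hurst) + t ** (2 * hurst) - abs(s - t) ** (2 * hurst))


def scaled_cov(s, t, k, n, hurst):
    """Covariance between beta_s and beta_{k^n t} / k^{nH} (n-fold rescaling)."""
    return fbm_cov(s, k ** n * t, hurst) / k ** (n * hurst)


# Covariances for n = 0, 10, 20, 30, 40 with s = t = 1, k = 2, H = 0.75.
values = [scaled_cov(1.0, 1.0, 2.0, n, 0.75) for n in range(0, 41, 10)]
for n, value in zip(range(0, 41, 10), values):
    print(n, value)
```

For $k > 1$ the printed values decrease roughly like $k^{n(H-1)}$, which is the rate suggested by a first-order expansion of $(k^n t)^{2H} - |s - k^n t|^{2H}$.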


Before we turn to the skew-product decomposition, we state a last preliminary
result on the $\frac{1}{H}$-variation of the integral $\int_0^t \frac{dB_s}{B_s}$ (see [5] for results
on the $\frac{1}{H}$-variation of divergence integrals with application to fractional Bessel
processes).





PROPOSITION 15. Set $t_i = \frac{it}{n}$, for $i = 0, 1, \ldots, n$. Then


$$\sum_{i=0}^{n-1} \Big|\int_{t_i}^{t_{i+1}} \frac{dB_s}{B_s}\Big|^{1/H} \to c \int_0^t \frac{ds}{|B_s|^{1/H}}$$


almost surely, as $n$ tends to infinity, where $c = E(|B_1|^{1/H})$.










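PROOF. For any complex-valued process $u_s$ whose paths are locally $\gamma$-Hölder
continuous, with $\gamma > 1 - H$, we set


$$V_n(u) = \sum_{i=0}^{n-1} \Big|\int_{t_i}^{t_{i+1}} u_s\,dB_s\Big|^{1/H}.$$


Then, by the Hölder inequality, we obtain


$$|V_n(u) - V_n(v)| \le \frac{1}{H} \sum_{i=0}^{n-1} \Big|\int_{t_i}^{t_{i+1}} (u_s - v_s)\,dB_s\Big| \Big(\Big|\int_{t_i}^{t_{i+1}} u_s\,dB_s\Big|^{1/H - 1} + \Big|\int_{t_i}^{t_{i+1}} v_s\,dB_s\Big|^{1/H - 1}\Big)$$
$$\le \frac{1}{H}\, V_n(u - v)^{H} \big(V_n(u)^{1-H} + V_n(v)^{1-H}\big). \tag{4.4}$$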


If $u_s$ is a step process, then, from the properties of the fBm (see [15]), we know
that $V_n(u)$ converges almost surely to $c \int_0^t |u_s|^{1/H}\,ds$. We can write


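$$\Big|V_n\Big(\frac{1}{B}\Big) - c \int_0^t \frac{ds}{|B_s|^{1/H}}\Big| \le \Big|V_n\Big(\frac{1}{B}\Big) - V_n(u_m)\Big| + \Big|V_n(u_m) - c \int_0^t |u_m(s)|^{1/H}\,ds\Big| + c\,\Big|\int_0^t |u_m(s)|^{1/H}\,ds - \int_0^t \frac{ds}{|B_s|^{1/H}}\Big|$$
$$= a_{n,m} + b_{n,m} + c_m,$$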


where 
Um(t) = 2 Bobet: 
j=0 “4-1 


Using (4.4), we can estimate the first summand as follows: 


1 1 H 1 ]—-H 
sg vn (gm) an.) 
>H 1/H o MB m $ 
: 1 i—H : 1--H 
rm San) vml enan) ) 


where $\mathrm{Var}_{\alpha}$ denotes the total variation of order $\alpha$ on the interval $[0, t]$. Then
Love-Young's inequality (see [21]) implies that $\lim_{m} \sup_{n} a_{n,m} = 0$. From the
previous results, we know that, for any fixed $m$, $\lim_{n} b_{n,m} = 0$, and clearly,
$\lim_{m} c_m = 0$. This completes the proof. $\square$








We can now turn to the asymptotics of the skew-product decomposition. The 
main ingredient of what follows is the ergodic theorem. 


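PROPOSITION 16. We have
$$\frac{1}{\log t} \int_0^t \frac{\delta B_s}{B_s} \xrightarrow[t \to +\infty]{\text{a.s.}} H$$


and


$$\frac{1}{\log t} \int_0^t \frac{ds}{|B_s|^{1/H}} \xrightarrow[t \to +\infty]{\text{a.s.}} \frac{\Gamma(1 - 1/(2H))}{2^{1/(2H)}}.$$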


PROOF. We first write
$$B_t = z_0 + \beta_t,$$
where $\beta$ is an fBm started at 0.
The key point is that
$$\frac{d\beta_s}{\beta_s} \quad\text{and}\quad \frac{ds}{|\beta_s|^{1/H}}$$
are invariant by scaling, and this suggests using Birkhoff-Khintchine's ergodic
theorem.

Let $k > 1$. Since the law of the standard fBm is invariant under the transformation
$T_k : \omega \to \frac{\omega(k\,\cdot)}{k^H}$, by Birkhoff-Khintchine's theorem and Lemmas 12 and 14,
we have

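$$\frac{1}{N} \sum_{n=0}^{N-1} \int_{k^n}^{k^{n+1}} \frac{d\beta_s}{\beta_s} \xrightarrow[N \to +\infty]{\text{a.s.}} Y \log k,$$
where $Y$ is constant.
We conclude that
$$\frac{1}{N \log k} \int_1^{k^N} \frac{d\beta_s}{\beta_s} \xrightarrow[N \to +\infty]{\text{a.s.}} Y$$
and, since $k$ was arbitrary,
$$\frac{1}{\log t} \int_1^t \frac{d\beta_s}{\beta_s} \xrightarrow[t \to +\infty]{\text{a.s.}} Y.$$
Now, because $B$ is transient,
$$\frac{1}{\log t} \int_1^t \frac{d\beta_s}{\beta_s} \underset{t \to +\infty}{\sim} \frac{1}{\log t} \int_0^t \frac{dB_s}{B_s}.$$


This implies
$$\frac{1}{\log t} \int_0^t \frac{dB_s}{B_s} \xrightarrow[t \to +\infty]{\text{a.s.}} Y.$$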
Let us now show that 


Re(Y) = H 


PLANAR FRACTIONAL BROWNIAN MOTION 171 


For this, we must check that 


logt / t+-+00 


which is a direct consequence of the scaling property for $\beta$. We deduce that
$$\mathrm{Re}\, Y = H.$$
Now, we note that
$$\mathrm{Im}\, Y = 0$$


because of the rotational invariance of the fBm.
Since the second limit is obtained in exactly the same way, we omit the details
of its proof. Let us just mention that it is based on the following computation:


$$\frac{1}{\log t}\, E\Big(\int_1^t \frac{ds}{|\beta_s|^{1/H}}\Big) = E\Big(\frac{1}{|N|^{1/H}}\Big),$$
where $N$ denotes a random variable with standard normal distribution on $\mathbb{R}^2$. $\square$
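
The constant in the second limit can be cross-checked numerically. For a standard normal $N$ on $\mathbb{R}^2$, $|N|^2/2$ is exponentially distributed, so $E(|N|^{-1/H}) = 2^{-a}\int_0^\infty u^{-a} e^{-u}\,du$ with $a = 1/(2H)$. The sketch below compares this integral, evaluated by a midpoint rule after the substitution $u = w^{1/(1-a)}$ (which removes the singularity at 0), with $\Gamma(1 - 1/(2H))/2^{1/(2H)}$. The function names, the step count and the truncation point are illustrative assumptions.

```python
import math


def limit_constant(hurst):
    """Closed form Gamma(1 - 1/(2H)) / 2^(1/(2H)) from Proposition 16."""
    a = 1.0 / (2.0 * hurst)
    return math.gamma(1.0 - a) / 2.0 ** a


def negative_moment(hurst, steps=200_000, upper=8.0):
    """E(|N|^(-1/H)) for a planar standard normal N, by a midpoint rule.

    After u = w^p with p = 1/(1 - a), the integral of u^(-a) e^(-u) over
    (0, infinity) becomes p times the integral of exp(-w^p), which is smooth.
    """
    a = 1.0 / (2.0 * hurst)
    p = 1.0 / (1.0 - a)
    h = upper / steps
    total = sum(math.exp(-(((i + 0.5) * h) ** p)) for i in range(steps))
    return 2.0 ** (-a) * p * h * total


print(negative_moment(0.75), limit_constant(0.75))
```

The two printed numbers should agree to many digits for any $H \in (\frac{1}{2}, 1]$; the truncation at `upper` is harmless because $\exp(-w^p)$ is negligible there.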


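REMARK 17. Unlike what happens in the case of the standard Brownian motion,
the process $\big(\int_0^t \frac{\delta B_s}{B_s}\big)_{t \ge 0}$ is not a complex fBm $\beta$ time-changed with the clock


$$\int_0^t \frac{ds}{|B_s|^{1/H}}.$$


Indeed, otherwise, we would have


$$\frac{1}{\log t} \int_0^t \frac{\delta B_s}{B_s} = \frac{1}{\log t}\, \beta_{\int_0^t ds/|B_s|^{1/H}} \xrightarrow[t \to +\infty]{\text{a.s.}} 0$$
because
$$\frac{\beta_t}{t} \xrightarrow[t \to +\infty]{\text{a.s.}} 0.$$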


REMARK 18. It is also possible to apply the ergodic theorem to obtain, more
generally, jointly with Proposition 11, the following limit theorem:


$$\frac{1}{\log t} \int_1^t f\Big(\frac{\beta_s}{|\beta_s|}\Big) \frac{ds}{|\beta_s|^{1/H}} \xrightarrow[t \to +\infty]{\text{a.s.}} \frac{\Gamma(1 - 1/(2H))}{2^{1/(2H)}} \int_{S} f\,d\sigma,$$


where $f$ is an integrable function on the unit circle $S$ endowed with its normalized
Haar measure $\sigma$ and $\beta$ is a two-dimensional fBm issued from 0.


As a direct consequence of this, and by time inversion, we deduce the following
corollary.




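COROLLARY 19. Let $(B_t)_{t \ge 0}$ be a complex fBm with Hurst parameter $H > \frac{1}{2}$
and started at 0. We have, for any $z \in \mathbb{C}$,


$$\frac{\log |B_t + z|}{\log t} \xrightarrow[t \to +\infty]{\text{a.s.}} H \quad\text{and}\quad \frac{\log |B_t + z t^{2H}|}{\log t} \xrightarrow[t \to 0^+]{\text{a.s.}} H.$$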


PROOF. The first part of the proposition stems immediately from the skew- 
product decomposition, indeed, 


log | B; | l (f x) 
— — — Re 
log t logt 0 B; 


For the second part, we use a time inversion argument. Indeed, we can write


$$B_t = t^{2H} \beta_{1/t},$$


where $\beta$ is a planar fractional Brownian motion issued from 0, and then we apply
the first part of the proposition to $\beta$. $\square$


Observe that, from the law of iterated logarithm (see, e.g., [9] and [10]), it is 
known that, for any H € (0, 1), the following inequality holds: 


|B| x AG +27) /ininG +2). 


Before we conclude this section, we give a small extension of Proposition 16. 


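PROPOSITION 20. Let $(B_t)_{t \ge 0}$ be a complex fBm with Hurst parameter
$H > \frac{1}{2}$ and started at $z_0 \neq 0$. Let $f : \mathbb{C} \to \mathbb{C}$ be a holomorphic function on the
whole plane such that $f(0) = 0$. Then


$$\frac{1}{\log t} \int_0^t f\Big(\frac{1}{B_s}\Big)\,\delta B_s \xrightarrow[t \to +\infty]{\text{a.s.}} H f'(0).$$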


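PROOF. Indeed, we can write for $z \in \mathbb{C}^*$ and some holomorphic function
$g : \mathbb{C} \to \mathbb{C}$,
$$f\Big(\frac{1}{z}\Big) = \frac{f'(0)}{z} + \frac{1}{z^2}\, g\Big(\frac{1}{z}\Big).$$
It stems from this that


$$\frac{1}{\log t} \int_0^t f\Big(\frac{1}{B_s}\Big)\,\delta B_s = \frac{f'(0)}{\log t} \int_0^t \frac{\delta B_s}{B_s} + \frac{1}{\log t} \int_0^t \frac{1}{B_s^2}\, g\Big(\frac{1}{B_s}\Big)\,\delta B_s.$$


Now, thanks to Itô's formula,


$$\frac{1}{\log t} \int_0^t \frac{1}{B_s^2}\, g\Big(\frac{1}{B_s}\Big)\,\delta B_s = \frac{1}{\log t} \Big(G\Big(\frac{1}{z_0}\Big) - G\Big(\frac{1}{B_t}\Big)\Big),$$


where $G$ denotes a primitive of $g$, which allows us to conclude according to Proposition 16. $\square$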







5. Study of the windings. In this section we shall again assume that $z_0 \neq 0$.
We denote
$$\theta_t = \int_0^t \frac{B^2_s\,dB^1_s - B^1_s\,dB^2_s}{|B_s|^2}.$$
From the skew-product decomposition of the two-dimensional fractional Brownian
motion, the process $\theta$ characterizes the windings of $B$. Though the
asymptotics of $\theta$ are not yet well understood, we present some results. First,
it is easy to study the windings of $B$ up to $2\pi$.


PROPOSITION 21. We have 


Ee") "T 1 


n—--oo thi ^ 





PROOF. From the skew-product decomposition, we get
$$\frac{B_t}{|B_t|} = \exp(i\theta_t), \qquad t \ge 0.$$
Hence, for $n \in \mathbb{N}$,
$$E(\exp(in\theta_t)) = E\Big(\frac{(t^H N + z_0)^n}{|t^H N + z_0|^n}\Big),$$


where $N$ is a two-dimensional standard normal law. The result follows then from
a straightforward computation on the two-dimensional standard normal law. $\square$


We now conclude the paper with a limit theorem for the functional of the two-
dimensional fractional Brownian motion


$$Z_t = \int_1^t \frac{B^2_s\,dB^1_s - B^1_s\,dB^2_s}{s^{2H}}, \tag{5.1}$$


which looks like the windings. Formula (2.3) allows us to compute the variance of 
this process: 


t pt 
E(Z?) = aH J J (rs)? R(s, r)r — s|^E—? dr ds 
1 J1 
(5.2) — 2a, J Um (sr) ?P 10. asslr — a|] — s|? dr ds dO da 
N 
< Cg logt. 


PROPOSITION 22. Let $Z_t$ be defined by (5.1). Then $\frac{Z_t}{\sqrt{\log t}}$ converges in law,
as $t$ tends to infinity, to the normal distribution $N(0, \sigma^2)$, where


$$\sigma^2 = 4H(2H - 1) \int_0^1 \Big(y^{-2H} R(y, 1) - H \log y^{-1} - H \beta\Big(\frac{y}{1 - y}\Big)\Big)(1 - y)^{2H-2}\,dy$$


and
$$\beta(y) = \int_0^1 (1 - x)^{2H-1} (x + y)^{-2H}\,dx.$$
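PROOF. Define, for $\lambda \in \mathbb{R}$ and $t > 1$,
$$X_t = \exp\Big(i\lambda \frac{Z_t}{\sqrt{\log t}}\Big),$$


and set $\varphi(\lambda, t) = E(X_t)$. Using the change of variable formula for the Riemann-
Stieltjes integral we can write, for $1 < t_0 < t$,


$$X_t = X_{t_0} + \int_{t_0}^t \frac{i\lambda}{\sqrt{\log u}}\, X_u u^{-2H} (B^2_u\,dB^1_u - B^1_u\,dB^2_u) - \int_{t_0}^t \frac{i\lambda}{2u (\log u)^{3/2}}\, X_u Z_u\,du. \tag{5.3}$$
Using (2.4), we can transform this pathwise integral into a divergence integral,
plus a correction term: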


X,u 7” (B? dB} — B} dB?) 











to -= 
t iX 
5.4 =| X,u-74 (B25Bl — Bl8B? 
( ) log u uu ( 5) 
[Du 2H pu 
egi | s Í (B2?pl x, — Bl D?X,)(u — r)? dr du. 
0 


Substituting (5.4) into (5.3) and taking expectations, we obtain 
pà, t) = 9, to) 


t y,—-2H u 
— ag? : E Í X,(2 DI Z, — BlD2Z,)(u — ry?" > dr du 
to j 
t 4 
ty 2u logu QA 





Aa u)du. 


Now we apply the duality relationship (2.2) and we get 
E(X, B2 D1 Z,) = E(Xy(D?D; Zu, 100,4) gj) 
+E((D? Xu, 10,55) 54D; Zu) 
= E(X,)(D^ D} Zu, 19,5) 5, 





+ E(Xu(D? Zu, 100,0) 4 D] Zu) 


iA 
flog u 




and we obtain 
P(A, t) =p, to) 





— (Dl ÖD Z, ; 1(0,u)) 4) (u — pH? dr du 


t y-2H 
— ag À* E(X,) 
to logu 
Paaa] 
« [WD Za Newly 
t à a 





- —(A At 
NOE qi vu du T WO. 


where 


t yz 
à, t) = —iX? / — —5 
V, 1) =i ay o Togu 


u 
x [EX (D?Zu, 104) Di Zu 




— (D! Zu, 1(0,u)) yD? Zu) (u — ry"? dr du. 


We have, for r < u, 
B? ¢* 8B? 
y2H |. gàH^ 


u B} B! 
2 0 r 
D^Z, =| 33H — AB 


D! Z, = 


Hence, 


(DD Zu. 10,5); — 0 lpg — r ^P 10,7), 10,5], 


H pu 
=ou | | 9-742 1g — 9|? H— dé do —r ?H R(y u) 
r 


u 
=H J g^ 7B (g^H-1 a — gy? #-}) do — 1-77 R(r, u) 
r 


— Hlog < + Hf (=) — p7?H R(r, u), 
r u — r 


where 


1 . 
B6) = J (1 — x)? -l 4. y)-2H dx, 


and, similarly, 


(D^ DI Z,, 1,4), — (r ^" 10,7) — 97716, 10,4), 


— r ?H R(r, u) — Hlog- -— us( 


r 





Hu c-r 


) 




Hence, 


pà, £) = 9, to) 
t ,4,72H 
— Jug J =) 
to logu 


x f (+ RO, u) — Hlog~ — üp(. “-)) 


(5.5) x (u —r)74-* dr du 
t À ðe 


— — (à, u)d À, t 
to 2ulogu JA MET N 








j 1 
= 90. to) — ory? | aneda 
to ulogu 


t à dag 
— —(AX,u)d A, t), 
to 2ulogu aa | a 





where 
1 
pa J (y "Ro, 1) — Hlogy"! — Hf (:3-))a — yy dy, 
! = 


We claim that $\lim_{t \to \infty} \psi(\lambda, t) = 0$. In order to show this property, let us write
E(X.(D Zu. 140,0) ge D; Zu) 


B2)  ["à8B2 
— E(X.(D?z,. 1(0,4)) e (5 u | zx )) 


= E((D?Z,, Liga) "(D X, 100,7); = (D? X,, OF Lew) se) 


i E 
= E(X,(D?Z,, 10,u)) g(r?” (D? Zu, 10,79) 3¢ 


~ / log u 





— (D*Z,, 0 ?P 1¢,u)) 3): 
Therefore, 
P(A, f) =—iVay 


t ye 
E Í Lm. 
à to log 45/2 d 
u 
x Í (D^ Zu, 19, p; 


X (r PAD Zi Lor) x ca (D?Zy, a 
+ (D! Zu; 1(0,u)) 3¢ 




x (r "(Di Zu, 19,5) — (D! Zu, 0 7" 16,45) ge) 
x (u — pH? dr du. 
We have 


:6B2 = 
r 


Hence, 
2 
P 5 Ba 
“aH 92H 2 


|D: Zu tolzel s en f n 


sc r "ir — 6|°4-2 dr d0 
0 J0 


Ir — 9127 —^ dr dO 














« Cu”, 


because, applying the inequality proved in [7], we obtain, for some constant cy, 


| [ 5 BZ 
r 92H 2 
Similarly, 


(D1z,, 9*1, yy) aan f| f (3 - [52 Jte - otto?! dodo 
H o2H 92H 


and 








« cy |o” Low luna) cear ps 





u u 
(D! Zu, 0 Etemel x C J | oH |o — 6 (24-29-24 do do 


= Cdr", 
where 
OO noo 
a=| | yt by — xl 77x ?H dx dy < oo. 
Also, 
uU rr s §B2 B? 
(Ze torzh san f [| | a- zu | lo - 9" a0 do 











H pr 
«cfl oH |o — o2 d9 do 


« Cr”. 
Finally, 
t uH u 
IW, £)] «c f —ap | p H (u — ry? 8? dr ds 
t logu3/2 Jo 
(5.6) 


C 
log 





> 


3 




where the constant $C$ depends on $t_0$, $H$ and $\lambda$. Similarly, we can derive the estimate











ay 
5.7 — (à, t) € ——_. 
ud ar | ] T t(logr)?? 
Equation (5.5) yields | 
3 | 
ot tlogt  2tlogt 0A ðt 
Hence, 


2 
(hat) = eG. to)exp |- f ZEEE gs| 
0 








slogs 
t S 2g A^ p(u) à dg day 
5.8 +f oof- | AEP du) (—À— À + Sa 
=) to iad t, ulogu : 25 logs dA = as) — 
The integral exp {— I der o) ds) behaves as or as t tends to infinity. We have, 
from (5.2), l 
= 
(5) e| < eer BG XOI s Cn 


Then, (5.9) and (5.7) imply ie MM t) = g(A, co) exists. By a simi- 
lar argument, differentiating equation (5.8) with respect to A, we can show that 


Itm, «oo "e = 22 (4, co) exists. We claim that 


(5.10) —2agAp(oo)p(A,oo) — =H 0. oo) = 0. 


Otherwise, 2 3€ behaves as "zr when ¢ tends to infinity, where C is a constant 


different from zero, which is contradictory because we would have 
p(A, t) 2o € 108108 f. 
Equality (5.10) implies that 
(A, 00) = e 298^ plo) 
Zt 





and $\frac{Z_t}{\sqrt{\log t}}$ converges in distribution as $t$ tends to infinity to the normal distribution
$N(0, 4\alpha_H \rho(\infty))$. $\square$


Notice also that, as $H$ tends to $\frac{1}{2}$, the variance $\sigma^2$ in the preceding proposition
converges to 2. This is reasonable, because if $B_t$ is a standard two-dimensional
Brownian motion, $\frac{1}{\sqrt{\log t}} \int_1^t \frac{B^2_s\,dB^1_s - B^1_s\,dB^2_s}{s}$ converges in distribution as $t$ tends to
infinity to the normal law $N(0, 2)$. Indeed, the Girsanov theorem implies that


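$$E\Big(\exp\Big(\frac{i\lambda}{\sqrt{\log t}} \int_1^t \frac{B^2_s\,dB^1_s - B^1_s\,dB^2_s}{s}\Big)\Big) = E\Big(\exp\Big(-\frac{\lambda^2}{2 \log t} \int_1^t \frac{|B_s|^2}{s^2}\,ds\Big)\Big)$$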














and, applying the ergodic theorem, we have that $\frac{1}{2 \log t} \int_1^t \frac{|B_s|^2}{s^2}\,ds$ converges almost
surely to one. Nevertheless, we could expect that, in the case of the fBm, which is
transient, the functional


$$\int_1^t \frac{B^2_s\,dB^1_s - B^1_s\,dB^2_s}{s^{2H}}$$


is closer to the windings of $B$ than is the case for the planar Brownian motion,
for which it is well known, since [18], that they behave like $\frac{C}{2} \log t$, where $C$ is a
Cauchy law with parameter 1.


Acknowledgments. This work was initiated while the first author was visiting
the IMUB from September to December 2002. He wishes to warmly thank
D. Nualart for his hospitality, his enthusiasm and his financial support.


REFERENCES 


[1] ALÒS, E., MAZET, O. and NUALART, D. (2001). Stochastic calculus with respect to Gaussian
processes. Ann. Probab. 29 766-801.

[2] ALÒS, E. and NUALART, D. (2003). Stochastic calculus with respect to the fractional Brownian
motion. Stochastics Stochastics Rep. 75 129-152.

[3] BAUDOIN, F. and NUALART, D. (2003). Equivalence of Volterra processes. Stochastic Process.
Appl. 107 327-350.

[4] DECREUSEFOND, L. and ÜSTÜNEL, A. S. (1998). Stochastic analysis of the fractional Brownian
motion. Potential Anal. 10 177-214.

[5] GUERRA, J. and NUALART, D. (2005). The 1/H-variation of the divergence integral with
respect to the fractional Brownian motion for H > 1/2 and fractional Bessel processes.
Stochastic Process. Appl. 115 91-115.

[6] LÉVY, P. (1965). Processus Stochastiques et Mouvement Brownien, 2nd ed. Gauthier-Villars,
Paris.

[7] MÉMIN, J., MISHURA, Y. and VALKEILA, E. (2001). Inequalities for the moments of Wiener
integrals with respect to fractional Brownian motion. Statist. Probab. Lett. 51 197-206.

[8] MESSULAM, P. and YOR, M. (1982). On D. Williams "pinching method" and some applications.
J. London Math. Soc. 26 348-364.

[9] OODAIRA, H. (1972). On Strassen's version of the law of the iterated logarithm for Gaussian
processes. Z. Wahrsch. Verw. Gebiete 21 289-299.

[10] OREY, J. (1972). Growth rate of certain Gaussian processes. Proc. Sixth Berkeley Symp. Math.
Statist. Probab. 443-451. Univ. California Press, Berkeley.

[11] PIPIRAS, V. and TAQQU, M. S. (2000). Integration questions related to fractional Brownian
motion. Probab. Theory Related Fields 118 251-291.

[12] PITMAN, J. and YOR, M. (1986). Asymptotic laws of planar Brownian motion. Ann. Probab.
14 733-779.

[13] PITMAN, J. and YOR, M. (1989). Further asymptotic laws of planar Brownian motion. Ann.
Probab. 17 965-1011.

[14] REVUZ, D. and YOR, M. (1999). Continuous Martingales and Brownian Motion, 3rd ed.
Springer, Berlin.

[15] ROGERS, L. C. G. (1997). Arbitrage with fractional Brownian motion. Math. Finance 7
95-105.

[16] RUSSO, F. and VALLOIS, P. (1993). Forward, backward and symmetric stochastic integration.
Probab. Theory Related Fields 97 403-421.

[17] SHI, Z. (1998). Windings of Brownian motion and random walks in the plane. Ann. Probab.
26 112-131.

[18] SPITZER, F. (1958). Some theorems concerning 2-dimensional Brownian motion. Trans. Amer.
Math. Soc. 87 187-197.

[19] TALAGRAND, M. (1995). Hausdorff measure of trajectories of multiparameter fractional
Brownian motion. Ann. Probab. 23 767-775.

[20] XIAO, Y. (1996). Packing measure of the sample paths of fractional Brownian motion. Trans.
Amer. Math. Soc. 348 3193-3213.

[21] YOUNG, L. C. (1936). An inequality of the Hölder type connected with Stieltjes integration.
Acta Math. 67 251-282.

[22] ZÄHLE, M. (1998). Integration with respect to fractal functions and stochastic calculus I.
Probab. Theory Related Fields 111 333-374.


LABORATOIRE DE STATISTIQUE
ET PROBABILITÉS
UNIVERSITÉ PAUL SABATIER
118 ROUTE DE NARBONNE
31500 TOULOUSE
FRANCE
E-MAIL: fbaudoin@cict.fr

FACULTAT DE MATEMÀTIQUES
UNIVERSITAT DE BARCELONA
GRAN VIA 585
08007 BARCELONA
SPAIN
E-MAIL: dnualart@ub.edu


The Annals of Probability 

2006, Vol. 34, No. 1, 181-218

DOI: 10.1214/009117905000000512

© Institute of Mathematical Statistics, 2006 


RANDOM GROWTH MODELS WITH POLYGONAL SHAPES¹


BY JANKO GRAVNER AND DAVID GRIFFEATH 
University of California, Davis and University of Wisconsin 


We consider discrete-time random perturbations of monotone cellular 
automata (CA) in two dimensions. Under general conditions, we prove the 
existence of half-space velocities, and then establish the validity of the Wulff 
construction for asymptotic shapes arising from finite initial seeds. Such a 
shape converges to the polygonal invariant shape of the corresponding deter- 
ministic model as the perturbation decreases. In many cases, exact stability is 
observed. That is, for small perturbations, the shapes of the deterministic and 
random processes agree exactly. We give a complete characterization of such 
cases, and show that they are prevalent among threshold growth CA with box 
neighborhood. We also design a nontrivial family of CA in which the shape 
is exactly computable for all values of its probability parameter. 


1. Introduction. Discrete local models for random growth and deposition 
have been a staple of rigorous research in probability since the Hammersley and 
Welsh paper [18] on first passage percolation about 40 years ago. Apart from their 
role as a testing ground for probabilistic techniques, a voluminous physics litera- 
ture [26, 28] testifies to their importance in understanding the evolution of natural 
systems far from equilibrium. The most basic tool, introduced in [18] and ubiqui- 
tous ever since, is subadditivity: the process dominates one restarted from an al- 
ready occupied point. Clearly, this imposes a monotonicity property on the model, 
but, as we will see, not much more. The result is the existence of an asymptotic 
shape: started from a finite seed, and scaled by time, the occupied set converges to 
a deterministic convex limit. Elegant as this method is, it is nonconstructive and as 
a result fails to provide any detailed information about the limiting set. Thus as- 
ymptotic properties of subadditive sequences are still an active area of research [2, 
34]. Are there cases when the shape can be exactly identified? Research on this 
topic has so far primarily focused on growth from infinite initial states, also known 
as random interfaces. Methods have ranged from hydrodynamic limits based on 
explicit identification of invariant measures [30, 31], to techniques arising from 
exactly solvable systems in mathematical physics [14, 20], and to perturbation ar- 
guments based on supercritical oriented percolation [6, 8] which imply that some 
interfaces move with the speed of their deterministic counterparts. For other related 
rigorous and empirical results see [16, 22, 27]. 


Received August 2003; revised April 2005. 
¹Supported in part by NSF Grants DMS-02-04376 and DMS-02-04018.
AMS 2000 subject classifications. Primary 60K35; secondary 11N25. 
Key words and phrases. Cellular automaton, growth model, asymptotic shape, exact stability. 




182 J. GRAVNER AND D. GRIFFEATH 


The main aim of this paper is to extend the perturbation approach to show that 
the finite limit shape of a random growth model may also agree with that of a 
deterministic one. At issue is not merely whether small random errors induce small 
changes (we will see that this is always the case), but rather whether the shape
can stay exactly the same. This property, which we call exact stability, is only 
valid under substantial assumptions, as the model has to have opposite structure, 
in an appropriate sense, from the additive one considered in [6]. In the process, we 
extend the result of [3] to obtain the Wulff characterization of the invariant shape. 
We also show that exact stability is far from rare; in fact, almost all members of 
arguably the most natural family of two-dimensional growth models, the threshold 
growth cellular automata with square neighborhood, are exactly stable. Finally, we
show how to employ exactly solvable systems to construct one example which has
a computable shape for every value of its probability parameter. (Although they 
are invaluable in suggesting universal phenomena, exactly solvable examples are 
extremely difficult to come by.) 

The random rules we describe below can be thought of as discrete counterparts 
to the KPZ equation, which in turn is touted as a universal scaling model for any 
local growth and deposition process in physics [26, 28], in particular for crystal 
growth [28]. This we mention because the well-studied roughening transition in 
crystallography, whereby a crystal loses its polygonal shape as the ambient temper- 
ature increases, produces pictures which are strikingly similar to ours [32]. While 
this transition is usually thought to be an equilibrium phenomenon, the present 
results at least suggest that it may have a dynamic counterpart. 

We now proceed to precise formulations. Unfortunately, these require a large 
number of definitions related to our previous work. Although we do not use any of 
the results from [10] explicitly, a glance at that paper's first two sections may help 
to motivate what follows. 

Our basic framework consists of two-state cellular automata (CA). In general,
such a cellular automaton is specified by the following two ingredients. The first
is a finite neighborhood $\mathcal{N} \subset \mathbb{Z}^2$ of the origin, its translate $x + \mathcal{N}$ then being the
neighborhood of point $x$. By convention, we assume that $\mathcal{N}$ contains the origin.
Typically, $\mathcal{N} = B_\nu(0, \rho) = \{x : \|x\|_\nu \le \rho\}$, where $\|\cdot\|_\nu$ is the $\ell^\nu$-norm. When $\nu = 1$
the resulting $\mathcal{N}$ is called the range $\rho$ Diamond neighborhood, while if $\nu = \infty$ it
is referred to as the range $\rho$ Box neighborhood. (In particular, range 1 Diamond
and Box neighborhoods are also known as von Neumann and Moore neighborhoods,
resp.) The second ingredient is a map $\pi : 2^{\mathcal{N}} \to \{0, 1\}$, which flags the sufficient
configurations for occupancy. More precisely, for a set $A \subset \mathbb{Z}^2$, we define
$\mathcal{T}(A) \subset \mathbb{Z}^2$ by adjoining every $x \in \mathbb{Z}^2$ for which $\pi((A - x) \cap \mathcal{N}) = 1$. Then, for a
given initial subset $A_0 \subset \mathbb{Z}^2$ of occupied points, we define $A_1, A_2, \ldots$ recursively
by $A_{t+1} = \mathcal{T}(A_t)$. Accordingly, occupied and vacant sites will often be denoted by
1's and 0's, respectively. Our main focus will be starting states $A_0$ which consist
of a possibly large, but finite set of 1's surrounded by 0's. However, we will also
consider other initial states, namely half-spaces and wedges.


RANDOM GROWTH MODELS 183 


We restrict to two-dimensional dynamics for two main reasons. First, almost 
every step in higher dimensions introduces new technical complications, some 
quite serious. In fact, there are new phenomena, and the classification of Theo- 
rem 2 below becomes much more complex. Second, some of our techniques are 
intrinsically two-dimensional, such as the explicitly solvable example of Section 6, 
the lattice geometry and analytic number theory of Section 7, and even combinato- 
rial properties studied in [3]. Nevertheless, some results—notably Theorem 1—do 
readily generalize to arbitrary dimension. 

Our key assumption is that the CA dynamics are monotone (or attractive), that
is, $S_1 \subset S_2$ implies $\pi(S_1) \le \pi(S_2)$. Note that specifying a monotone dynamics is
the same as specifying an antichain of subsets of $\mathcal{N}$: the inclusion-minimal sets $S$
with $\pi(S) = 1$ having the property that none of them is a subset of another. Surprisingly,
the number of possible monotone dynamics (known as a Dedekind number)
is possible to estimate for large $\mathcal{N}$. Some typical properties of monotone CA are
also known [23]. Unfortunately, it turns out that for large box neighborhoods the
asymptotic proportion of supercritical rules (see the definition below) is negligible.
Other interesting properties seem to present great difficulties. In studying typical
monotone CA rules, it is therefore desirable to restrict to a simpler class.

A natural such class consists of totalistic monotone CA, those for which $\pi(S)$
depends only on the cardinality $|S|$ of $S$. In other words, there exists a threshold
$\theta > 0$ such that $\pi(S) = 0$ whenever $|S| < \theta$ and $\pi(S) = 1$ whenever $|S| \ge \theta$. This
much studied case is also known by the name threshold growth (TG) CA.
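
To make the definition concrete, here is a minimal Python sketch of one step of a threshold growth CA, following the description above: a site is adjoined when at least $\theta$ points of its neighborhood are occupied, and occupied sites are kept. The names `box_neighborhood` and `tg_step`, and the $2 \times 2$ seed, are illustrative assumptions and do not come from the paper.

```python
from itertools import product


def box_neighborhood(rho):
    """Range-rho Box neighborhood: all offsets with sup-norm at most rho."""
    return [(dx, dy) for dx, dy in product(range(-rho, rho + 1), repeat=2)]


def tg_step(occupied, neighborhood, theta):
    """One step of threshold growth: keep occupied sites and adjoin every
    site that sees at least theta occupied points in its neighborhood."""
    counts = {}
    for (x, y) in occupied:
        for (dx, dy) in neighborhood:
            # (x, y) lies in the neighborhood of the site (x - dx, y - dy).
            site = (x - dx, y - dy)
            counts[site] = counts.get(site, 0) + 1
    adjoined = {site for site, count in counts.items() if count >= theta}
    return set(occupied) | adjoined


# Hypothetical example: a 2 x 2 seed under the Moore neighborhood.
seed = {(0, 0), (1, 0), (0, 1), (1, 1)}
moore = box_neighborhood(1)
print(sorted(tg_step(seed, moore, 2)))
```

With this seed and the Moore neighborhood, threshold 2 adds the two sites adjacent to each side of the square, while threshold 3 leaves the seed unchanged, so the seed does not fill space in that case.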

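Induced by $\mathcal{T}$ is a growth transformation $\bar{\mathcal{T}}$ on closed subsets of $\mathbb{R}^2$, given by


$$\bar{\mathcal{T}}(B) = \{x \in \mathbb{R}^2 : 0 \in \mathcal{T}((B - x) \cap \mathbb{Z}^2)\}.$$


In words, one translates the lattice so that $x \in \mathbb{R}^2$ is at the origin, and applies $\mathcal{T}$ to
the intersection of Euclidean set $B$ with the translated lattice. It is easy to verify
that the two transformations are conjugate,


$$\mathcal{T}(B \cap \mathbb{Z}^2) = \bar{\mathcal{T}}(B) \cap \mathbb{Z}^2.$$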


It will become immediately apparent why $\bar{\mathcal{T}}$ is convenient. Let $S^1 \subset \mathbb{R}^2$ be the set
of unit vectors and let


$$H_u^- = \{x \in \mathbb{R}^2 : \langle x, u \rangle \le 0\}$$


be the closed half-space with outward normal $u \in S^1$. Then there exists a $w(u) \in \mathbb{R}$
so that


$$\bar{\mathcal{T}}(H_u^-) = H_u^- + w(u) \cdot u$$
and consequently
$$\mathcal{T}^t(H_u^- \cap \mathbb{Z}^2) = (H_u^- + t\,w(u)\,u) \cap \mathbb{Z}^2.$$


If $w(u) > 0$ for every $u$ we call the CA supercritical. A supercritical CA hence
enlarges every half-space. This is equivalent to existence of a finite set $A_0$ which
fills space, that is, $\bigcup_{t \ge 0} A_t = \mathbb{Z}^2$ [3, 10]. All initial sets will be assumed to fill
space from now on. Set


Kj — | J(I0, 1/w(0)] -u:u € S!) 
and let L be the polar transform of K | /,, that is, 
Tam Kw = {x e R°: (x,u) x w(u)). 


Then one can prove the following limiting shape result for any finite Ag: 


lim Ai E 

too f 
where the limit is taken in the Hausdorff metric. In short, the shape L = L(z) is 
obtained as the Wulff transform of the speed function w: S! — R, which for small 
neighborhoods is readily computable by hand or by computer. Furthermore, L is 
always a polygon and the Hausdorff distance between A; and £L is bounded in 
time ¢ [9—13, 37]. 
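To illustrate how the speed function can be computed by computer for a TG CA, here is a minimal sketch. It relies on the characterization given in Section 3, namely that $w(u)$ is the largest $\lambda \ge 0$ for which $\pi((-\lambda u + H_u^-) \cap \mathcal{N}) = 1$; for threshold $\theta$ this means that at least $\theta$ neighborhood sites $x$ satisfy $\langle x, u\rangle \le -\lambda$. The names `box_neighborhood` and `half_space_speed` are hypothetical.

```python
import math


def box_neighborhood(rho):
    """All lattice offsets (i, j) with max(|i|, |j|) <= rho, origin included."""
    return [(i, j) for i in range(-rho, rho + 1) for j in range(-rho, rho + 1)]


def half_space_speed(neighborhood, theta, u):
    """Half-space speed w(u) of a threshold growth CA in direction u (a unit vector).

    A site x lies in -lam*u + H_u^- exactly when <x, u> <= -lam, so the largest
    admissible lam is the theta-th largest value of -<x, u> over the neighborhood
    (or 0 if that value is negative).
    """
    depths = sorted((-(x * u[0] + y * u[1]) for (x, y) in neighborhood), reverse=True)
    return max(depths[theta - 1], 0.0)
```

For the Moore neighborhood with $\theta = 3$ this gives $w(e_2) = 1$ and $w((1,1)/\sqrt{2}) = 1/\sqrt{2}$, while $\theta = 4$ gives $w(e_2) = 0$, so that rule is not supercritical.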

To formulate the stability properties of $L$ under random perturbations, we begin
by introducing a general monotone random dynamics. The function $\pi$ differs from
the one described above in that it has values in $[0, 1]$. Upon seeing a set of occupied
sites $x + S$ in its neighborhood at time $t$, a site becomes occupied at time $t + 1$
independently with probability $\pi(S)$. To obtain a monotone rule we require that
$\pi(S_1) \le \pi(S_2)$ whenever $S_1 \subset S_2$.

More precisely, introduce i.i.d. vectors $\xi_{x,t}$, $x \in \mathbb{Z}^2$, $t = 0, 1, 2, \ldots$, with $2^{|\mathcal{N}|}$
coordinates $\xi_{x,t}(S)$, which are Bernoulli($\pi(S)$) for every $S \subset \mathcal{N}$. We assume that
these are coupled so that $\xi_{x,t}(S_1) = 1$ implies that $\xi_{x,t}(S_2) = 1$ whenever $S_1 \subset S_2$.
The construction of such a coupling is left as an exercise for the reader. The random
sets $A_1, A_2, \ldots$ are now determined by

$$A_{t+1} = \{x : \xi_{x,t}(((x + \mathcal{N}) \cap A_t) - x) = 1\}.$$
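One possible way to realize such a coupling (an assumption of this sketch, not necessarily the construction the authors have in mind) is to drive all coordinates of $\xi_{x,t}$ by a single uniform random number $U$ and set $\xi_{x,t}(S) = 1_{\{U < \pi(S)\}}$; monotonicity of $\pi$ then gives the required implication. The names `standard_tg_pi`, `coupled_xi` and `random_step` are hypothetical, and the step function assumes $\pi(\emptyset) = 0$ so that only sites near the occupied set need to be examined.

```python
import random


def standard_tg_pi(theta, p):
    """Hypothetical standard p-perturbation of a TG rule, with solidification."""
    def pi(S):
        if (0, 0) in S:
            return 1.0  # an occupied site stays occupied
        return p if len(S) >= theta else 0.0
    return pi


def coupled_xi(pi, u):
    """Coordinates xi(S) = 1{u < pi(S)}, all driven by one uniform number u."""
    return lambda S: 1 if u < pi(S) else 0


def random_step(occupied, neighborhood, pi, rng):
    """One update A_t -> A_{t+1}; assumes pi(empty set) = 0 and a finite A_t."""
    candidates = {(x - i, y - j) for (x, y) in occupied for (i, j) in neighborhood}
    new = set()
    for (x, y) in candidates:
        # S = ((x + N) intersected with A_t) - x, as a set of offsets.
        S = frozenset((i, j) for (i, j) in neighborhood if (x + i, y + j) in occupied)
        if coupled_xi(pi, rng.random())(S):
            new.add((x, y))
    return new
```

With $p = 1$ the step reduces to the deterministic update, which gives a simple sanity check.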


To avoid some trivialities and inessential complications, we assume that 1's only
grow by contact: $\pi(\emptyset) = 0$, and that $\pi$ is symmetric: $-\mathcal{N} = \mathcal{N}$ and $\pi(-S) =
\pi(S)$. Much more substantial is the assumption that $\pi$ solidifies: $\pi(S) = 1$ whenever
$0 \in S$. These three properties, together with monotonicity, will be our standing
assumptions throughout the paper.

For every random $\pi$, we set $p = \min\{\pi(S) : \pi(S) > 0\}$, define the associated
deterministic dynamics by its map $\pi_d(S) = 1_{\{\pi(S) > 0\}}$, and label the iteration transform
$\mathcal{T}$ as before. We will say that $\pi$ is a p-perturbation of the CA $\mathcal{T}$. For many
purposes the standard p-perturbation, which has $\pi(S) = p$ whenever $\pi(S) > 0$,
suffices.

We say that a p-perturbation of $\mathcal{T}$ has shape $L_\pi$ if

$$\lim_{t \to \infty} \frac{A_t}{t} = L_\pi$$

almost surely, in the Hausdorff metric, for every finite initial set $A_0$ which fills
space. We say that $\mathcal{T}$ has exactly stable shape $L$ if there exists a $p < 1$ such that
$L_\pi = L$ (which of course subsumes the existence of the shape $L_\pi$) for the standard,
and hence any, p-perturbation $\pi$. For a standard perturbation, we also write
$L_p = L_\pi$. Thus $L_1 = L$. Recall that the deterministic growth at time $t$ is included
in a constant fattening of $tL_1$; hence the same is true of any p-perturbation.

As already mentioned, such considerations are in the general direction of the
vintage Durrett-Liggett flat edge result [6]. To describe their result in our context,
recall that a deterministic CA is additive if $\pi(S)$ equals 1 precisely when
$S$ is nonempty. In this case $K_{1/w} = \mathcal{N}^*$ and $L = \mathrm{co}(\mathcal{N})$. Moreover, any standard
perturbation is a first passage percolation model, and as such has an almost sure
(deterministic) limiting shape $L_p$ for each $p > 0$ [5, 29]. For the von Neumann
neighborhood, Durrett and Liggett proved that, if $p$ is close to 1, then $L_p$ is close
to $L$ and in fact inherits from $L$ flat edges in the four diagonal directions. However,
they show that $L_p$ is not equal to $L$, due to the fact that its extent in the coordinate
directions is strictly less than 1.

The existence of a limiting shape $L_\pi$ for general random dynamics does not
immediately follow from standard subadditivity arguments. A sufficient condition
is a property of $\mathcal{T}$ we call local regularity. Namely, for every initial state $A_0$ there
exists a constant $C$ so that the following is true for every fixed (deterministic)
assignment of $\xi_{x,t}$: every $x \in A_t$ at distance at least $C$ from $A_0$ has an occupied set
$G \subset A_t$ entirely within distance $C$ of $x$ such that $G$ fills space.

Note that local regularity is a combinatorial condition involving every possible
way $A_t$ can evolve, and thus has nothing to do with probability. At first it
seems a condition not likely to be often satisfied, but the opposite appears true.
One can easily check local regularity directly for many cases with small $\mathcal{N}$, and
it holds generally for box neighborhood TG CA. All known counterexamples involve
"strange" neighborhoods [3]. Under this condition, it can readily be shown
that $L_\pi$ exists.

Besides finite shapes, limiting profiles from half-spaces are of considerable in- 
terest. The first reason is that their Monte Carlo approximations can be computed 
much more efficiently (see Remark 2 in Section 8). The second is that they are 
important for shapes from other infinite sets, such as wedges and holes [13]. For 
finite seeds also, the Wulff transform (see Corollary 1.1 below), which expresses 
the asymptotic shape in terms of half-space velocities, is very handy. However, the 
limit theorem in [3] does not extend to infinite seeds, as restarting requires an a 
priori upper bound on fluctuations. Here we provide the missing step, which es- 
tablishes the following large deviations bound, referred to as the Kesten property 
in [13]. (See [21] for a similar result in the first passage context.) 


THEOREM 1. Let $\pi$ be a p-perturbation of a locally regular supercritical CA
and let the initial set be $A_0 = H_u^- \cap \mathbb{Z}^2$ for $u \in S^1$. Then there exists a deterministic
$w_\pi(u) > 0$ such that

$$H_u^- + t(w_\pi(u) - \varepsilon)\cdot u \subset A_t \subset H_u^- + t(w_\pi(u) + \varepsilon)\cdot u$$

within the lattice ball of radius $t^2$ with probability at least $1 - \exp(-c_\varepsilon t)$. Here
$c_\varepsilon > 0$ as soon as $\varepsilon > 0$.


COROLLARY 1.1. For a p-perturbation $\pi$ of a locally regular CA and finite
initial sets which fill space,

$$\frac{A_t}{t} \to K_{1/w_\pi}^*$$

in the Hausdorff metric, almost surely.


Next is a generalization of the flat edge result [6]. In particular this implies that
$L_p \to L_1$ when $p \to 1$, as promised.


PROPOSITION 1.2. Given a standard p-perturbation of a locally regular CA
and any $\varepsilon > 0$, there exists a $p < 1$ close enough to 1 that $L_p$ agrees with $L_1$
outside the $\varepsilon$-neighborhood of the set of corners of $L_1$.


Our second theorem provides necessary and sufficient conditions for exact stability.
Before its statement, it is instructive to look at the three supercritical Moore
TG CA. The $\theta = 1$ case is additive and exact stability cannot hold. (This can be
proved by the methods of [6] or [25], but we give a different argument in Section 3.)
For $\theta = 2$ one finds that $K_{1/w} = \mathrm{co}(\mathcal{N})$ and hence this is a quasi-additive
case, that is, a CA with convex $K_{1/w}$. Quasi-additive CA share many properties
with additive ones [10, 11, 13], and lack of exact stability turns out to be among
them. Finally, in the $\theta = 3$ case $K_{1/w}$ has 16 vertices, of which three successive
ones are (0, 1), (1, 2), (1, 1), and the remaining 13 are then continued by symmetry.
(This set, which the reader is invited to compute, is the innermost region
of Figure 7.) Eight of these are the only points that $K_{1/w}$ shares with the boundary
of its convex hull. In a sense, the fact that these eight vertices form a discrete
set makes this CA as unlike a quasi-additive one as possible. This turns out to be
precisely the condition needed for exact stability.
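The boundary points of $K_{1/w}$ quoted above can be checked with exact arithmetic. The following is a minimal sketch under the assumption of a symmetric neighborhood; the name `k_boundary_point` is hypothetical. For a nonzero integer vector $d$ and $u = d/\|d\|$, the speed is $w(u) = m/\|d\|$, where $m$ is the $\theta$-th largest value of $\langle x, d\rangle$ over the neighborhood, so the boundary point $u/w(u)$ equals $d/m$.

```python
from fractions import Fraction


def box_neighborhood(rho):
    """All lattice offsets (i, j) with max(|i|, |j|) <= rho, origin included."""
    return [(i, j) for i in range(-rho, rho + 1) for j in range(-rho, rho + 1)]


def k_boundary_point(neighborhood, theta, d):
    """Boundary point u/w(u) of K_{1/w} in the direction of the integer vector d.

    Assumes the neighborhood is symmetric, so the theta-th largest value of
    <x, d> equals the theta-th largest value of -<x, d>.
    """
    values = sorted((x * d[0] + y * d[1] for (x, y) in neighborhood), reverse=True)
    m = values[theta - 1]
    if m <= 0:
        raise ValueError("w(u) = 0 in this direction: the rule is not supercritical")
    return (Fraction(d[0], m), Fraction(d[1], m))
```

For the Moore neighborhood with $\theta = 3$ the directions (0, 1), (1, 2) and (1, 1) return exactly the three vertices listed above, and with $\theta = 2$ the direction (1, 2) returns (1/2, 1), a point on the boundary of the square $\mathrm{co}(\mathcal{N})$.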

Accordingly, we denote

$$\partial K' = \partial(K_{1/w}) \cap \partial(\mathrm{co}(K_{1/w})),$$

and describe the relevance of properties of this set in our main result.


THEOREM 2. Consider a supercritical locally regular CA (which also satisfies
our standing assumptions) given by $\mathcal{T}$, with limiting shape $L_1$, and its standard
p-perturbation. There are three possibilities:

Case 1. $\partial K'$ consists of isolated points, no three of which are collinear.
Then the following hold for $p < 1$ close enough to 1:
(S1) $L_p = L_1$.
(S2) Convergence to $L_1$ is tight: for any $\varepsilon > 0$, there exists an $M$ so that,
for any $t$ and $x \in tL_1$, $P(x \text{ is within } M \text{ of } A_t) \ge 1 - \varepsilon$.
(S3) There exists a large $C$ so that, with probability 1, $(t - C\log t)L_1 \cap
\mathbb{Z}^2 \subset A_t$ eventually.
Case 2. $\partial K'$ consists of isolated points, three of which are collinear.
Then (S1) and (S3) still hold for $p < 1$ close enough to 1, but tightness
(S2) no longer does. Instead, for any $p < 1$ there exists a $c > 0$ so that a
corner of $tL_1$ is eventually at distance at least $c\log t$ from $A_t$, a.s.
Case 3. $\partial K'$ includes a line segment.
Then (S1) no longer holds. Instead, for any $p < 1$ there exists a $c > 0$
so that a corner of $tL_1$ is at distance at least $ct$ from $A_t$, a.s.


Figure 1 shows a box neighborhood TG example for each of the three cases,
from left to right, with periodic shading of updates: Case 1 (range 1, $\theta = 3$,
$p = 0.9$), Case 2 (range 2, $\theta = 7$, $p = 0.95$) and Case 3 (range 2, $\theta = 8$, $p = 0.95$).

The fundamental difference between Moore $\theta = 2$ and $\theta = 3$ TG CA is their
mistake-fixing ability, which we now illustrate. Suppose we start each case with
a large copy of the invariant shape and remove a finite chunk of occupied sites at
the boundary. Regardless of the location of such a hole, the $\theta = 3$ case eventually
repairs (or "erodes") it and thus the hole's effect is bounded in time. Figure 2
provides a demonstration. This eroding property can be used to favorably compare
the random dynamics on infinite wedges, determined by the corners of $L_1$, to
Toom rules [35]. The corners are then patched together by an oriented percolation
comparison in the middles of the edges. In Case 2, mistakes are still fixed, but for
wider wedges than in Case 1, and corners must be rounded off accordingly.

By contrast, the $\theta = 2$ TG CA can only repair holes away from the corners,
while those at the corners have a lasting effect, as also seen in Figure 2. In a random
dynamics, such mistakes pile up and induce a linear slowdown.





FIG. 1. The three cases of Theorem 2.

FIG. 2. Error correcting for $\theta = 2$ and $\theta = 3$.


Given the exact stability criterion of Theorem 2, it is natural to ask whether a
typical supercritical CA has an exactly stable shape or not. As already mentioned,
properties of typical monotone CA seem difficult to characterize. We will thus restrict
our attention to a special family, TG CA with range $\rho$ box neighborhoods $\mathcal{N}_\rho$.
These are supercritical for $\theta \le \rho(2\rho + 1)$ [10]. The smallest examples are already
illuminative. As $\mathcal{N}_1$ has already been discussed, $\mathcal{N}_2$ is next in line and turns out to
have $\theta = 1, 2, 3, 5, 8$ in Case 3, $\theta = 7, 9, 10$ in Case 2, and $\theta = 4, 6$ in Case 1. For
very large ranges, $\theta$'s in Case 3 form a small minority, as the following theorem
demonstrates.


THEOREM 3. Fix an arbitrary $\varepsilon > 0$. Among all supercritical range $\rho$ box
neighborhood TG CA, the proportion of those which are not exactly stable
is for large $\rho$ between $1/\log^{h+\varepsilon}\rho$ and $1/\log^{h}\rho$. Here $h = 2(1 - 1/\log 2 -
\log\log 2/\log 2) \approx 0.172$.


The proof of Theorem 3 connects the number of $\theta$'s which lack exact stability
to the number of distinct products of pairs of natural numbers between 1 and $\rho$.
This latter is known as the Linnik-Vinogradov-Erdős problem, for which sharp
asymptotic bounds were given by Hall and Tenenbaum [17]. We have no result
on the division between Cases 1 and 2, but conjecture that Case 2 is much more
prevalent.
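As a small numerical illustration of the two quantities just mentioned, the following sketch counts distinct products by brute force and evaluates the exponent from Theorem 3. The function names `distinct_products` and `exponent_h` are hypothetical, and the brute-force count is only practical for small ranges; it says nothing about the asymptotics.

```python
import math


def distinct_products(rho):
    """Number of distinct values a*b with 1 <= a, b <= rho (brute force)."""
    return len({a * b for a in range(1, rho + 1) for b in range(a, rho + 1)})


def exponent_h():
    """The constant h = 2(1 - 1/log 2 - log log 2 / log 2) from Theorem 3."""
    log2 = math.log(2.0)
    return 2.0 * (1.0 - 1.0 / log2 - math.log(log2) / log2)
```

For example, `distinct_products(10)` counts the distinct entries of the 10 by 10 multiplication table, and `exponent_h()` agrees with the value 0.172 quoted in the theorem to three decimal places.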

The rest of the paper is organized as follows. Section 2 contains the proof of a
slightly weaker version of Theorem 1 and its Corollary 1.1. Section 3 deals with
Case 3, while Section 4 lays the geometric groundwork for the remaining cases and
proves Proposition 1.2. In Section 5 we introduce Toom's method and complete
the proof of Theorem 2. Section 6 is devoted to a single example for which we can
compute the shape for all values of the probability parameter $p$. In Section 7 we
take a closer look at the collection of $K_{1/w}$'s for fixed-range box neighborhoods,
an analysis which culminates with the proof of Theorem 3. Finally, in Section 8
we finish the proof of Theorem 1 and discuss other related issues in lesser detail.


2. Proof of Theorem 1. Recall that Theorem 1 deals with supercritical locally
regular CA and their p-perturbations. These will be our context throughout this
section. We will allow all constants $C$ and $c$ to depend on $\mathcal{T}$ and $p$ in addition to
their explicitly stated dependencies. (We emphasize that these constants will not,
however, depend on the direction $u$.) In this section we only obtain a lower bound
of the form $1 - \exp(-c_\varepsilon t/\log^2 t)$ on the probability of the event in Theorem 1.

Many times below we will restart the random dynamics at a deterministic time
or a random stopping time $\tau$. This simply means that only $\xi_{x,t}$ with $t \ge \tau$ are used,
with an initial state at time $\tau$ which will be specified.


LEMMA 2.1. Assume that a finite $A_0 \ni 0$ fills the plane. Assume that $x$ is at
distance $n$ from the origin. Then there exist constants $c, C > 0$ (depending on $A_0$)
so that $P(x \notin A_k) \le e^{-ck}$ for $k \ge Cn$.


PROOF. Call $x$ surrounded at time $t$ if $x + A_0 \subset A_t$. By supercriticality,
there exists a time $t_0$ at which $\pm e_1$ and $\pm e_2$ are all surrounded in the deterministic
dynamics. Let $C_0 = |\mathcal{T}^{t_0}(A_0)|$. If $p_0 = p^{C_0}$, then $\pm e_1$ and $\pm e_2$ are all
surrounded at time $t_0$ with probability at least $p_0$. Take a shortest lattice path
$0 = x_0, x_1, \ldots, x_n = x$. We now define i.i.d. geometric($p_0$) random variables
$T_1, \ldots, T_n$ as follows. Run the dynamics for time $t_0$. If $x_1$ is surrounded at this time,
$T_1 = 1$, otherwise restart the dynamics with $A_0$ at time $t_0$. Now run the restarted
dynamics for time $t_0$; if it surrounds $x_1$ at this time, $T_1 = 2$, otherwise restart again
with $A_0$, and so on. In general, on the event $\{T_i = k\}$, $T_{i+1}$ is the minimal $\ell \ge 1$ for
which the dynamics restarted at time $k + (\ell - 1)t_0$ with $x_i + A_0$ surrounds $x_{i+1}$ at
time $t_0$.

By monotonicity and exponential Chebyshev,

$$P(x \notin A_k) \le P(T_1 + \cdots + T_n > k) \le e^{-\lambda k}\, E(\exp(\lambda T_1))^n,$$

for any $\lambda > 0$. To conclude the proof, choose $\lambda$ small enough that
$E(\exp(\lambda T_1)) < \infty$. □
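The exponential Chebyshev step can be illustrated numerically. The sketch below compares the bound $e^{-\lambda k} E(\exp(\lambda T_1))^n$ with the exact tail of a sum of $n$ i.i.d. geometric($p_0$) variables supported on $\{1, 2, \ldots\}$. The function names are hypothetical; the moment generating function formula and the identity between the tail of the sum and a binomial probability are standard facts rather than statements from the paper.

```python
import math


def geometric_mgf(lam, p0):
    """E[exp(lam*T)] for T geometric(p0) on {1, 2, ...}; infinite if lam is too large."""
    q = 1.0 - p0
    if q * math.exp(lam) >= 1.0:
        return math.inf
    return p0 * math.exp(lam) / (1.0 - q * math.exp(lam))


def chernoff_bound(n, k, p0, lam):
    """Exponential Chebyshev bound on P(T_1 + ... + T_n > k)."""
    return math.exp(-lam * k) * geometric_mgf(lam, p0) ** n


def exact_tail(n, k, p0):
    """P(T_1 + ... + T_n > k): fewer than n successes in k Bernoulli(p0) trials."""
    return sum(math.comb(k, j) * p0**j * (1.0 - p0) ** (k - j) for j in range(n))
```

For fixed $\lambda$ below $-\log(1 - p_0)$ the bound decays exponentially in $k$ once $k$ exceeds a large multiple of $n$, which is the regime $k \ge Cn$ of the lemma.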


Note that this lemma implies that $w_\pi(u)$, if it indeed exists, is bounded away
from 0 uniformly in $u$, for any $p > 0$.


LEMMA 2.2. Assume that $|\mathcal{T}(A_0) \setminus A_0| = n$, and start the p-perturbation
from the same initial set $A_0$. If $\tau = \inf\{t : \mathcal{T}(A_0) \subset A_t\}$, then $E(\tau) \le p^{-1}
(\log n + 3)$.


PROOF. Note that all such sites attempt to get occupied simultaneously, each
of them at each time with probability $p$. Hence $\tau$ is geometric($p$) for $n = 1$.
For $n \ge 2$, write $a = -\log(1 - p)$ and divide the sum below into terms with
$k < a^{-1}\log n$ and with $k \ge a^{-1}\log n$ to obtain

$$\begin{aligned}
E(\tau) &= \sum_{k=0}^{\infty} \big(1 - (1 - e^{-ak})^n\big) \\
&\le a^{-1}\log n + 1 + \sum_{i=0}^{\infty} \big(1 - (1 - e^{-ai} n^{-1})^n\big) \\
&\le a^{-1}\log n + 1 - \sum_{i=0}^{\infty} n \cdot \log(1 - e^{-ai} n^{-1}) \\
&\le a^{-1}\log n + 1 + 2\sum_{i=0}^{\infty} e^{-ai} \\
&= a^{-1}\log n + 1 + 2(1 - e^{-a})^{-1},
\end{aligned}$$

and $p = 1 - e^{-a} \le a$. □
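The bound of Lemma 2.2 can be checked numerically from the first line of the display, since $\tau$ is the maximum of $n$ independent geometric($p$) waiting times in the standard case. This is a minimal sketch with the hypothetical name `expected_fill_time`; the series is truncated once its terms drop below a tolerance, so the returned value is a slight underestimate.

```python
import math


def expected_fill_time(n, p, tol=1e-12):
    """Approximate E(tau) = sum_k (1 - (1 - (1-p)^k)^n), truncated at terms < tol."""
    total, k = 0.0, 0
    while True:
        term = 1.0 - (1.0 - (1.0 - p) ** k) ** n
        if term < tol:
            return total
        total += term
        k += 1


def lemma_bound(n, p):
    """Right-hand side of Lemma 2.2: (log n + 3) / p."""
    return (math.log(n) + 3.0) / p
```

For $n = 1$ the sum reduces to the mean $1/p$ of a geometric variable, and for larger $n$ the logarithmic growth in $n$ is visible when comparing, say, $n = 10$ with $n = 1000$.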


PROOF OF THEOREM 1 WITH WEAKER PROBABILITY ESTIMATE. Without
loss of generality, we can assume that $u$ lies on or above $y = |x|$, that is, $\langle u, e_2\rangle \ge
1/\sqrt{2}$. Let $\mathcal{F}_t = \sigma\{\xi_{x,s} : s \le t - 1, x \in \mathbb{Z}^2\}$, $t = 1, 2, \ldots$.

Let $T_n$ be the first time $(0, n)$ becomes occupied started from $H_u^-$ and set $\bar{T}_n =
T_n \wedge Cn$. By Lemma 2.1, $P(T_n \ne \bar{T}_n) \le e^{-cn}$, for a large enough $C$ and some
$c > 0$. The crucial step is this $L^\infty$ bound:

(2.1)    $|E(\bar{T}_n \mid \mathcal{F}_{s+1}) - E(\bar{T}_n \mid \mathcal{F}_s)| \le C' \log n$,

for any $s \le Cn$ and some constant $C'$.

Recall that $\bar{T}_n$ is a deterministic function of $\xi_{x,t}$ where $(x, t)$ ranges over all
space-time sites. As $\mathcal{N}$ is finite, $\bar{T}_n$ depends only on a small subset of these
variables. To be more precise, let $\mathcal{L}_n$ comprise the sites $(x, t)$ for which $\bar{T}_n$ depends
on $\xi_{x,t}$. Then $|\mathcal{L}_n| \le Cn^3$ and we can assume that the filtration ignores
all other sites. At time $s \le Cn$, let $\partial A_s$ consist of all the sites outside $A_s$ which
would become occupied if the deterministic dynamics were applied to $A_s$. Trivially,
$|\partial A_s| \le |\mathcal{L}_n|$.

Restart the dynamics at time $s + 1$ with $A_s$. Let $\tau_s$ be the waiting time after
this at which all sites in $\partial A_s$ are occupied, that is, $\tau_s = \inf\{k : \partial A_s \subset A_{s+1+k}\}$. By
Lemma 2.2, $E(\tau_s \mid \mathcal{F}_s) \le C'' \log n$.

We now prove (2.1). We will repeatedly use the strong Markov property and
monotonicity of the dynamics. To get the lower bound in (2.1), assume the worst
case: no sites outside $A_s$ (i.e., in $\partial A_s$) get occupied, and therefore the dynamics
faces an unchanged situation at time $s + 1$. Therefore,

$$E(\bar{T}_n \mid \mathcal{F}_{s+1}) \le E(\bar{T}_n \mid \mathcal{F}_s) + 1.$$

For the upper bound, assume that $\mathcal{F}_{s+1}$ reveals that all sites in $\partial A_s$ get occupied.
Before we know $\mathcal{F}_{s+1}$, we can only assume this happens after time $\tau_s$, and so the
dynamics with the additional information is dominated by the one restarted at time
$s + \tau_s$ from the occupied set $A_s \cup \partial A_s$. It follows that

$$E(\bar{T}_n \mid \mathcal{F}_s) \le E(\bar{T}_n \mid \mathcal{F}_{s+1}) + E(\tau_s \mid \mathcal{F}_s) \le E(\bar{T}_n \mid \mathcal{F}_{s+1}) + C\log n.$$

This proves (2.1).
Now let $a_n = E(T_n)$, $\bar{a}_n = E(\bar{T}_n)$. By (2.1) and Azuma's inequality [19, 33],

(2.2)    $P(|\bar{T}_n - \bar{a}_n| > s) \le 2\exp(-cs^2/(n\log^2 n))$.

However,

$$|a_n - \bar{a}_n| \le E(T_n 1_{\{T_n > Cn\}}),$$

which is bounded by Lemma 2.1. From this it follows that

$$P(|T_n - a_n| > s) \le P(|\bar{T}_n - \bar{a}_n| > s/2 - C) + P(T_n - \bar{T}_n > s/2),$$

and after another application of Lemma 2.1 and suitable redefinition of $c$,

(2.3)    $P(|T_n - a_n| > s) \le 2\exp(-cs^2/(n\log^2 n)) + Ce^{-cs}$.
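The passage from (2.1) to (2.2) can be spelled out as follows. This is a sketch under two labeled assumptions: $\mathcal{F}_0$ denotes the trivial sigma-field (the text only defines $\mathcal{F}_t$ for $t \ge 1$), and the increment bound (2.1) is taken to hold for every index up to $N = \lceil Cn \rceil + 1$.

```latex
% Assumptions: \mathcal{F}_0 is trivial and N = \lceil Cn \rceil + 1.
% Doob martingale of the truncated passage time:
M_j = E(\bar T_n \mid \mathcal{F}_j), \quad j = 0, 1, \ldots, N,
\qquad M_0 = \bar a_n, \qquad M_N = \bar T_n .
% (2.1) bounds every increment:
|M_{j+1} - M_j| \le d, \qquad d = C' \log n .
% Azuma's inequality for a martingale with increments bounded by d:
P\bigl(|M_N - M_0| > s\bigr) \le 2 \exp\!\left(-\frac{s^2}{2 N d^2}\right)
  = 2 \exp\!\left(-\frac{s^2}{2 N (C')^2 \log^2 n}\right),
% which is (2.2) with a constant c of order 1/(C (C')^2).
```

Here $M_N = \bar{T}_n$ because $\bar{T}_n \le Cn$ is determined by the variables $\xi_{x,t}$ with $t \le Cn$.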


For an integer $i$, let $y_i$ be the largest $j$ for which $(i, j) \in H_u^-$. Then let $T_n'$
be the first time at which all sites in $B' = \{(i, y_i + n) : |i| \le n^2\}$ are occupied.
Moreover, let $T_n''$ be the first time at which all the sites $B'' = \{(i, j) : |i| \le n^2, y_i +
n - C \le j \le y_i + n\}$ are occupied, where $C$ is the diameter of the neighborhood $\mathcal{N}$.
Restart the dynamics at time $T_n'$ with the occupied set at this time. Note that local
regularity implies that within a constant time the deterministic dynamics occupies
a large ball within a constant distance of any occupied point. By monotonicity, the
deterministic dynamics would occupy $B''$ in $t_1$ additional time steps, where $t_1$ is a
constant which only depends on $\mathcal{T}$. Applying Lemma 2.2 $t_1$ times one thus obtains

$$E(T_n'' - T_n') \le C\log n.$$


Furthermore, let $T_n'(i)$ be the time the dynamics reaches $(i, n + y_i)$ and let
$T_n(i)$ be the first time $(i, n + y_i)$ becomes occupied from the modified initial set
$y_i e_2 + H_u^-$. The reason for this convoluted condition is that $T_n(i)$ with the same $n$
are identically distributed, but this is not true for $T_n'(i)$.

To deal with different starting sets for $T_n(i)$, let $S_n$ be the time the random
dynamics fills $H_u^- \cap B(0, n^2)$ from $-e_2 + H_u^-$ (which is contained in all starting
sets). By a similar argument as in the previous paragraph

$$E(S_n) \le C\log n.$$

Therefore, with $a_n'(i) = E(T_n'(i))$,

$$0 \le a_n'(i) - a_n \le E(S_n) \le C\log n.$$

Furthermore, the argument leading to (2.3) can be carried out with $T_n$ replaced by
$T_n'(i)$ and $a_n$ by $a_n'(i)$.
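Therefore, for $s \ge C\log n$,

$$\begin{aligned}
P(T_n' - T_n \ge s) &\le P(|T_n - a_n| > s/4) + \sum_{|i| \le n^2} P(|T_n'(i) - a_n'(i)| > s/4) \\
&\le Cn^2 \exp(-cs^2/(n\log^2 n)) + Cn^2 e^{-cs}.
\end{aligned}$$

It follows that

$$E(T_n' - T_n) \le C\sqrt{n}\,\log^2 n + \int_{C\sqrt{n}\log^2 n}^{\infty} P(T_n' - T_n \ge s)\,ds,$$

which after a short computation implies that $E(T_n' - T_n) \le C\sqrt{n}\,\log^2 n$.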

We are almost done, but need an estimate for yet another approximation to $T_n$.
Let $T_n'''$ be the first occupation time of $(0, n)$ started from $B'' - ne_2 = \{(i, j) :
|i| \le n^2, y_i - C \le j \le y_i\}$. Then, for $0 \le k \le Cn$,

$$P(T_n''' - T_n \ge k) \le P(T_n''' \ne T_n) \le P(T_n \ge Cn) \le e^{-cn},$$

while for $k \ge Cn$,

$$P(T_n''' - T_n \ge k) \le P(T_n''' \ge k) \le e^{-ck}$$

by Lemma 2.1. Hence $E(T_n''' - T_n)$ is bounded above by a constant.
Now assume that $0 \le m \le n$. Restarting the growth process at time $T_n''$, we get

$$a_{m+n} \le a_m + a_n + E(T_n'' - T_n) + E(T_m''' - T_m) \le a_m + a_n + C\sqrt{n}\,\log^2 n.$$


By the deBruijn-Erdős subadditive theorem [33], $a_n/n$ converges to a finite positive
number $a$, which of course depends on $p$ and $u$. We declare $w_\pi(u) =
\langle u, e_2\rangle / a$.
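For orientation, the form of the subadditive theorem with an error term that is being applied can be sketched as follows. The statement is quoted from memory and should be checked against [33]; the function $\varphi$ and the integral computation are illustrative additions, not taken from the paper.

```latex
% Hedged statement: let \varphi be positive and nondecreasing with
\int_1^\infty \varphi(t)\, t^{-2}\, dt < \infty .
% If a_{m+n} \le a_m + a_n + \varphi(m+n) for all m, n with n/2 \le m \le 2n,
% then a_n / n converges to a limit in [-\infty, \infty).
%
% Here one may take \varphi(t) = C \sqrt{t}\, \log^2 t, and with t = e^s,
\int_1^\infty \frac{C \sqrt{t}\, \log^2 t}{t^2}\, dt
  = C \int_1^\infty t^{-3/2} \log^2 t \, dt
  = C \int_0^\infty s^2 e^{-s/2}\, ds
  = 16\, C < \infty .
```

The error term $C\sqrt{n}\log^2 n$ obtained above for $m \le n$ is dominated by $\varphi(m+n)$ because $\varphi$ is nondecreasing.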
To finish the proof, take first an $(i, j)$ outside $H_u^- + t w_\pi(u)(1 + \varepsilon)\cdot u$. Let
$n = j - y_i \ge t w_\pi(u)(1 + \varepsilon)/\langle u, e_2\rangle = t(1 + \varepsilon)/a$. Then

$$\begin{aligned}
P((i, j) \in A_t) &= P(T_n'(i) \le t) \\
&\le P(T_n'(i) \le na/(1 + \varepsilon)) \\
&\le P(|T_n'(i) - a_n'(i)| \ge na\varepsilon/2) \\
&\le \exp(-cn/\log^2 n),
\end{aligned}$$

for a large enough $n$. This proves the weaker version of the upper bound in Theorem 1.
The lower bound is proved similarly. □




Several remarks are in order. First, note that the proof avoids the subadditive 
 ergodic theorem altogether, by combining properties of subadditive sequences with 
large deviation estimates. 

Second, it is in fact possible, by the same methods, to obtain a superadditive
relation for $a_n$ of the same order, namely,

$$a_{m+n} \ge a_m + a_n - C\sqrt{n}\,\log^2 n.$$

A closer look at the proof of the deBruijn-Erdős theorem (from [33]) then gives a
rate of convergence for $a_n$: $|a_n/n - a| = O(\log^2 n/\sqrt{n})$, which can be used to show
that, within a lattice ball of radius $t^2$, $A_t$ is a.s. between $(t \pm C\sqrt{t}\,\log^2 t)\cdot w_\pi(u)\cdot
u + H_u^-$.

Third, the proof uses supercriticality and regularity only to "fill in." For any
monotone, local, interface solidification with automatic coherence the proof remains
valid. While we will not attempt to precisely define the concept, automatic
coherence certainly holds when the interface moves upward (i.e., $u = e_2$) and the
growth is such that an empty site can never have an occupied site directly above
it. Perhaps the simplest example is the random dynamics in which a site becomes
occupied for sure with two or more occupied neighbors in its von Neumann neighborhood
and with probability $p$ with an occupied site directly below. Another class
of examples are the K-exclusion processes [31]. For some of these examples, the
fluctuation estimates mentioned above may be new.

Finally, and curiously, there seems no way to make the proof work for general 
monotone dynamics which do not solidify. Such cases thus remain an intriguing 
challenge. 


LEMMA 2.3. Fix an $a > 0$ and $\varepsilon > 0$. Then there exist constants $c, C$ so
that the following holds. Start the dynamics from $A_0$ consisting of sites inside
$(H_u^- \setminus (-Cu + H_u^-)) \cap B(0, Cn)$. Then, $A_n$ includes all sites inside $B = ((H_u^- +
n w_\pi(u)(1 - \varepsilon)u) \setminus H_u^-) \cap B(0, Cn)$ with probability at least $1 - e^{-cn/\log^2 n}$.


PROOF. Let $T'(x)$ [resp. $T(x)$] be the first occupation time of $x \in B$ started
from the stated $A_0$ (resp. from $H_u^-$). By Lemma 2.1, $P(\sup_{x \in B} T(x) \ge Cn) \le
e^{-cn}$. However, by a "speed of light" argument, on $\{\sup_{x \in B} T(x) < Cn\}$ the equality
$T(x) = T'(x)$ holds for all $x \in B$. The claim now follows from Theorem 1. □


LEMMA 2.4. The function $w_\pi : S^1 \to \mathbb{R}$ is continuous.


PROOF. Again assume that $\langle u, e_2\rangle \ge 1/\sqrt{2}$. For a fixed large $C$ and small
$\varepsilon > 0$, $H_u^- - Ct\varepsilon e_2 \subset H_v^- \subset H_u^- + Ct\varepsilon e_2$, within the lattice ball of radius $Ct$,
provided $\|u - v\| < \varepsilon/2$. Let $E_u(k, t)$ be the event that all sites on the y-axis up
to $k$ are occupied at time $t$ started from $H_u^-$.

By Lemma 2.3 and Theorem 1, both the events $E_v((1 - C\varepsilon) w_\pi(u) t/\langle v, e_2\rangle, t)$
and $E_v((1 + \varepsilon) w_\pi(v) t/\langle v, e_2\rangle, t)^c$ happen with probability (very) close to 1. This
is only possible if $(1 - C\varepsilon) w_\pi(u) \le (1 + \varepsilon) w_\pi(v)$. An analogous reverse inequality
is proved similarly. □


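PROOF OF COROLLARY 1.1. An $\varepsilon > 0$ will be fixed throughout this proof.
For any direction $u$, Theorem 1 implies that with probability exponentially close
to 1

$$A_t \subset H_u^- + t w_\pi(u)(1 + \varepsilon)\cdot u.$$

It follows that with probability 1

$$\frac{A_t}{t} \subset H_u^- + w_\pi(u)(1 + \varepsilon)\cdot u,$$

eventually. This is therefore true simultaneously for any finite collection of $u$'s and
then by Lemma 2.4,

$$\frac{A_t}{t} \subset \bigcap_{u \in S^1} \big(H_u^- + w_\pi(u)(1 + \varepsilon)^2\cdot u\big) = (1 + \varepsilon)^2 K_{1/w_\pi}^*,$$

eventually.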


For the lower bound, take a bounded, strictly convex, $C^2$ set $K_\varepsilon \supset K_{1/w_\pi}(1 + \varepsilon)$.
Then $L_\varepsilon = K_\varepsilon^*$ is $C^2$ and has for small enough $\delta > 0$ the property described in the
next paragraph.

Start with $A_0$ consisting of sites inside $nL_\varepsilon$. Take $k = n^2/\delta$ Euclidean points
$x_0, \ldots, x_{k-1}$ on the boundary of $nL_\varepsilon$, chosen so their directions are equidistant
vectors in $S^1$, and let $u_0, \ldots, u_{k-1}$ be the outside normals to $nL_\varepsilon$ at the chosen
points. The enlarged set

$$L_n(\delta) = \bigcap_{i=0}^{k-1} \big(x_i + (1 - \delta)\sqrt{n}\, w_\pi(u_i) u_i + H_{u_i}^-\big)$$

includes $(n + \sqrt{n})L_\varepsilon$.

Now run the random dynamics from $A_0$ for $\sqrt{n}$ time steps. Since $L_\varepsilon$ has $C^2$
boundary, we need to go just a constant distance inside to "see" the relevant portion
of $H_{u_i}^-$. To be more precise, $((-Cu_i + H_{u_i}^-) \setminus (-2Cu_i + H_{u_i}^-)) \cap B(x_i, C\sqrt{n})$
is included in $nL_\varepsilon$, for all $i$. By Lemma 2.3, with probability at least $1 -
\exp(-c\sqrt{n}/\log^2 n)$, all the sites in $(n + \sqrt{n})L_\varepsilon$ become occupied.

Repeat the above procedure (running the random dynamics for $\sqrt{n}$ time steps)
$3\sqrt{n}$ times. As a result, $(n + j\sqrt{n})L_\varepsilon \subset A_{j\sqrt{n}}$ for $j = 1, \ldots, 3\sqrt{n}$ (in particular,
$4nL_\varepsilon \subset A_{3n}$), with probability at least $p_n = 1 - \exp(-c\sqrt{n}/\log^2 n)$.

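Now fix an $\alpha < 1$ and find a large $k_0$ so that $\prod_{k \ge k_0} p_{2^{2k}} \ge \alpha$. Let $T_0$ be the first
time $t$ at which $2^{2k_0} L_\varepsilon \subset A_t$. By what we proved so far,

$$P\big((2^{2k} + j2^k) L_\varepsilon \subset A_{T_0 + 2^{2k} - 2^{2k_0} + j2^k} \text{ for } j = 0, \ldots, 3\cdot 2^k,\ k = k_0, k_0 + 1, \ldots\big) \ge \alpha.$$

We thus have a strictly increasing sequence of integers $b_m$ with $b_{m+1} - b_m =
o(b_m)$, such that

$$b_m(1 - \varepsilon) L_\varepsilon \subset A_{b_m}$$

eventually, with probability at least $\alpha$, thus a.s., as $\alpha$ was arbitrary. For any $t$ between
$b_m$ and $b_{m+1}$,

$$t(1 - \varepsilon)^2 L_\varepsilon \subset b_m(1 - \varepsilon) L_\varepsilon \subset A_{b_m} \subset A_t$$

eventually, finishing the proof of the lower bound. □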


3. Lack of exact stability in Case 3. Fix a $u \in S^1$. Let $\ell_u$ be the boundary
line of $-w(u)\cdot u + H_u^-$. Note that $w(u)$ is the largest number $\lambda \ge 0$ for which
$\pi((-\lambda\cdot u + H_u^-) \cap \mathcal{N}) = 1$. Therefore, $\mathcal{N} \cap \ell_u$ must contain at least one site.

In general, for any line $\ell$ in the plane which does not go through the origin, let its
open (resp. closed) lower cut $\mathcal{L}^o(\ell)$ [resp. $\mathcal{L}^c(\ell)$] be the set of points in $\mathcal{N}$ which
lie in the open (resp. closed) half-space of $\mathbb{R}^2 \setminus \ell$ which does not contain the origin.
We emphasize here (as this convention will be used extensively) that the points
in $\mathcal{L}^o(\ell)$ will be called below the line $\ell$, and that left and right on the line are from
the perspective of an observer who stands on $\ell$ and looks toward the origin.

We will make good use of duality between lines in $K_{1/w}$ and points of $\mathcal{N}$ in
the sequel. The next lemma is our first example of this duality. To illustrate its
statement (as well as the introduced terminology), let us consider an example.
Assume that we are dealing with a TG CA and fix a direction $u$. Suppose also
that $\ell_u$ contains an $x_u \in \mathcal{N}$ such that a line $\ell$ obtained by a small rotation of $\ell_u$
around $x_u$ has exactly $\theta - 1$ sites in $\mathcal{L}^o(\ell)$. Note that for a sufficiently small rotation
no other site but $x_u$ is in $\ell \cap \mathcal{N}$. (An example for the range 2 Box TG CA with $\theta = 8$
is depicted on the right-hand side of Figure 3.) Therefore, for $v$ close enough to $u$,
$\ell_v$ is obtained by rotation of $\ell_u$ around $x_u$. A little geometric argument involving
polar coordinates then shows that the boundary of $K_{1/w}$ must be flat at $u/w(u)$.
The lemma makes a stronger and more general statement and is illustrated by the
left-hand side of Figure 3.

FIG. 3. Illustration of Lemma 3.1 and its proof.


LEMMA 3.1. The following are equivalent for a $u \in S^1$.

(1) There exists a line through $u/w(u)$ which in a small neighborhood of $u/w(u)$
lies in $K_{1/w}$.

(2) There exists a point $x_u \in \ell_u \cap \mathcal{N}$ so that if $\ell$ is a line through $x_u$ and is a
rotation of $\ell_u$ by a small enough angle, $\pi(\mathcal{L}^o(\ell)) = 0$.

In case $\partial K_{1/w}$ is locally a line at $u/w(u)$, $x_u$ in (2) is unique. In fact, the smaller
angle between $\partial K_{1/w}$ and $u/w(u)$ is the same as the smaller angle between the
vector $x_u$ and $\ell_u$.


PROOF. Note that a short line segment through $u/w(u)$ perpendicular to the
vector $u_0 \in S^1$ is given in polar coordinates (with the angle represented by a unit
vector $v$) by the collection of vectors

$$\left\{ \frac{\langle u, u_0\rangle}{w(u)\,\langle v, u_0\rangle}\, v : \|v - u\| < \alpha \right\}$$

for a small $\alpha > 0$.

Assume first that the statement (2) holds. Let $u_0 = -x_u/\|x_u\|$ be the unit vector
pointing from $x_u$ to the origin. Then (2) says that for $v$ close enough to $u$, $w(v) \le
w(u)\langle v, u_0\rangle/\langle u, u_0\rangle$. It follows that

(3.1)    $\dfrac{1}{w(v)} \ge \dfrac{1}{w(u)} \cdot \dfrac{\langle u, u_0\rangle}{\langle v, u_0\rangle}$.

The polar representation of a line mentioned above immediately demonstrates the
implication (2) $\Rightarrow$ (1).

To prove the reverse implication, note that (1) implies that (3.1) holds for
some $u_0$, and let $x_u' = -w(u)u_0/\langle u, u_0\rangle$. Then $x_u'$ has the properties required of $x_u$,
except it may not lie in $\mathcal{N}$. However, we can let $x_u$ be the closest site in $\ell_u \cap \mathcal{N}$.
(We can in fact go in either direction from $x_u'$.) The fact that $\mathcal{N}$ is discrete ensures
that a parallel translation from $x_u'$ to $x_u$ of any line $\ell$ close to $\ell_u$ does not pass
through any site of $\mathcal{N}$. Thus (2) is satisfied.

To prove the last statement, note that two different $x_u$ would, by (3.1), produce
two distinct open line segments, which would meet at $u/w(u)$ and which would
both be included in $K_{1/w}$. But then a flat portion of $\partial K_{1/w}$ near $u/w(u)$ would be
impossible. □

LEMMA 3.2. Fix a $u \in S^1$ which satisfies the condition of Lemma 3.1 and pick
a corresponding $x_u$. For $v$ close enough to $u$, the concave wedge $Q = H_u^- \cup H_v^-$
satisfies $\bar{\mathcal{T}}(Q) \subset -x_u + Q$. When $\partial K_{1/w}$ is locally a line at $u/w(u)$, $Q$ is invariant:
$\bar{\mathcal{T}}(Q) = -x_u + Q$.


PROOF. The first part follows from Lemma 3.1: a point in $\bar{\mathcal{T}}(Q) \setminus (-x_u + Q)$
would imply that a point on the boundary of $-x_u + Q$ sees a sufficient configuration
in the interior of $Q$, but clearly $-x_u$ is in the most advantageous
position for this. This would translate, for $\ell$ as in (2) of Lemma 3.1, into
$\pi(\mathcal{L}^o(\ell) \cup \mathcal{L}^o(\ell_u)) = 1$, but if the rotation is sufficiently small (by discreteness
of $\mathcal{N}$) $(\mathcal{L}^o(\ell) \cup \mathcal{L}^o(\ell_u)) \cap \mathcal{N} = \mathcal{L}^o(\ell) \cap \mathcal{N}$, a contradiction.

The second part also follows because in this case $\pi(\mathcal{L}^c(\ell)) = 1$. For, otherwise
$x_u$ could be moved to the next point in $\ell_u \cap \mathcal{N}$ for which $\pi(\mathcal{L}^c(\ell)) = 1$.
[Again, such a point must exist or else $w(u)$ could be decreased.] This would contradict
uniqueness. □


When $u$ satisfies the assumption of Lemma 3.2 there exists an invariant wedge
of the following form:

$$Q' = (-Mv_1 + H_{v_1}^-) \cup (-Mv_2 + H_{v_2}^-) \cup H_u^-,$$

where $v_1$ and $v_2$ are close to, but on different sides of, $u$ and $M$ is large enough.
In particular, a hole of shape $Q'$ dug into $H_u^-$ may be translated by the dynamics,
but is never filled. If the creation of such holes is random they pile up and, as
we will demonstrate by the comparison process we now introduce, slow down the
interface.

The following randomly growing surface will be useful here and in Section 8.
At every time $t = 0, 1, 2, \ldots$ a site $x \in \mathbb{Z}$ has a height $\eta_t(x) \in \mathbb{Z}_+$, with $\eta_0 \equiv 0$.
We will use two versions, which we call fast and slow, of the rule for increase in
heights. Let $b(x, t)$ be Bernoulli random variables with $P(b(x, t) = 1) = p'$. The
slow version evolves according to the following rule:

$$\eta_{t+1}(x) = \begin{cases} \eta_t(x) + 1, & \text{if } b(x, t) = 1 \text{ and } \eta_t(y) \ge \eta_t(x) \text{ for all } y \text{ with } |y - x| \le 1, \\ \eta_t(x), & \text{otherwise,} \end{cases}$$

while the fast version updates as follows:

$$\eta_{t+1}(x) = \begin{cases} \eta_t(x) + 1, & \text{if } b(x, t) = 1 \text{ or } \eta_t(y) > \eta_t(x) \text{ for some } y \text{ with } |y - x| \le 1, \\ \eta_t(x), & \text{otherwise.} \end{cases}$$

Note that the reverse dynamics, $\bar{\eta}_t(x) = t - \eta_t(x)$, changes the version and replaces
$p'$ by $1 - p'$. We will assume that $b(x, t)$ are not necessarily independent,
but have finite range dependence in space: if either $t_1 \ne t_2$ or $|x_1 - x_2| > r$, then
$b(x_1, t_1)$ and $b(x_2, t_2)$ are independent.
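The reversal property can be verified on a small simulation. This is a minimal sketch on a finite ring of sites with periodic boundary conditions, which is an assumption made only to keep the example finite (the paper works on all of $\mathbb{Z}$); the names `step` and `run` are hypothetical. Flipping every bit $b(x, t)$ and switching from the slow to the fast rule should turn $\eta_t$ into $t - \eta_t$.

```python
import random


def step(heights, bits, version):
    """One synchronous update of the slow or fast surface on a ring of sites."""
    n = len(heights)
    new = list(heights)
    for x in range(n):
        h = heights[x]
        left, right = heights[(x - 1) % n], heights[(x + 1) % n]
        if version == "slow":
            # Grow only if the bit is 1 and no neighbor is lower.
            grow = bits[x] == 1 and left >= h and right >= h
        else:
            # Grow if the bit is 1 or some neighbor is strictly higher.
            grow = bits[x] == 1 or left > h or right > h
        if grow:
            new[x] = h + 1
    return new


def run(bit_rows, version):
    """Run the surface from the flat state, one row of bits per time step."""
    heights = [0] * len(bit_rows[0])
    for bits in bit_rows:
        heights = step(heights, bits, version)
    return heights
```

Running the slow version on a table of bits and the fast version on the flipped table gives heights that add up to the number of time steps at every site.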


LEMMA 3.3.
(1) For the slow version: Given any $p' > 0$, there exist an $a > 0$ and $c > 0$ so that

$$P(\eta_t(x) \le at) \le e^{-ct}.$$

(2) For the slow version: Given any $\varepsilon > 0$, there exist a large enough $p'$ and a
$c > 0$ so that

$$P(\eta_t(x) \le (1 - \varepsilon)t) \le e^{-ct}.$$

(3) For the fast version: Given any $\varepsilon > 0$, there exist a small enough $p'$ and a
$c > 0$ so that

$$P(\eta_t(x) \ge \varepsilon t) \le e^{-ct}.$$


PROOF. The proof of (1) and (2) is a last passage percolation argument.
By [24] we can in fact assume that the random variables $b(x, t)$ are independent.
Once the neighborhood condition ($\eta_t(y) \ge \eta_t(x)$ for all $y$ with $|y - x| \le 1$) is
satisfied, a site $x$ has to wait a geometric($p'$) number of time steps before it increases.
Accordingly, let $g(x, s)$ be i.i.d. geometric with success probability $p'$. By a
simple inductive argument, it follows that the first time $T_n(x)$ when $\eta_t(x) = n \ge 1$
equals


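\[
T_n(x) = \max\Bigl\{ \sum_{i=0}^{n-1} g(x_i, i) : x_{n-1} = x,\ x_{i+1} \in \{x_i - 1, x_i, x_i + 1\} \text{ for } 0 \le i < n - 1 \Bigr\}.
\]

Hence

\[
P(\eta_s(x) < n) = P(T_n(x) > s) \le 3^n P\Bigl( \sum_{i=0}^{n-1} g(0, i) > s \Bigr).
\]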
By an elementary large deviation computation, we get, for a fixed $p' > 0$ and
a small enough $a > 0$, $P(\eta_s(x) < as) \le \exp(-cs)$, which implies (1). Another
large deviation computation gives $P(\eta_s(x) < (1 - \varepsilon)s) \le \exp(-cs)$ for a fixed
$\varepsilon > 0$ and $p'$ close enough to 1. Finally, (3) follows from (2) by reversal. □
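
The last passage representation used in the proof can be checked on small examples. The sketch below is an illustration under stated assumptions: the sites live on a finite ring, `g[i][y]` stands for the waiting time of site `y` at height level `i`, and all function names are hypothetical. It computes the hitting times by the one-step recursion $T_{i+1}(y) = g(y, i) + \max_{|z - y| \le 1} T_i(z)$ and compares them with a brute-force maximum over nearest-neighbor paths.

```python
import random
from itertools import product


def geometric(rng, p):
    """Sample a geometric(p) variable supported on 1, 2, 3, ..."""
    k = 1
    while rng.random() >= p:
        k += 1
    return k


def last_passage_times(g):
    """Return T_n(y) for every site y, where n = len(g), via dynamic programming."""
    m = len(g[0])
    times = [0] * m  # T_0 = 0 everywhere
    for level in g:
        times = [level[y] + max(times[(y - 1) % m], times[y], times[(y + 1) % m])
                 for y in range(m)]
    return times


def brute_force_time(g, x):
    """Maximise the path sum over all nearest-neighbour paths ending at x."""
    n, m = len(g), len(g[0])
    best = 0
    for steps in product((-1, 0, 1), repeat=n - 1):
        # Walk from x at the top level n-1 down to level 0.
        y, total = x, g[n - 1][x]
        for level, d in zip(range(n - 2, -1, -1), steps):
            y = (y + d) % m
            total += g[level][y]
        best = max(best, total)
    return best
```

The dynamic program costs on the order of $nm$ operations, whereas the brute-force maximum enumerates $3^{n-1}$ paths; the latter count is what produces the factor $3^n$ in the union bound.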


PROOF OF THEOREM 2 IN CASE 3. Let $u \in S^1$ be a direction of an interior
point in a line segment of $\partial K'$. Then $u/w(u)$ belongs to the interior of a line
segment of $\partial K_{1/w}$ (satisfying the condition of Lemma 3.1) and the corner of $L$ which
corresponds to the edge of $\mathrm{co}(K_{1/w})$ containing $u/w(u)$ moves, in the deterministic
case, with speed $w(u)$ in direction $u$. The following claim will therefore finish
the proof. Start the random dynamics from $A_0 = H_u^- \cap \mathbb{Z}^2$. Then, for some $a > 0$,

\[ P\bigl(A_t \subset (1 - a)w(u)tu + H_u^-\bigr) \ge 1 - e^{-at}. \tag{3.2} \]


For simplicity, rotate the space so that $u = e_2$. Recall that $2M$ is the width of
the bottom edge of $Q'$. Assign $\zeta_1(i) = 0$ if $A_1 \cap (iMe_1 + Q') = \emptyset$, and $\zeta_1(i) = 1$
otherwise. In general, let $\zeta_t(i)$ be the smallest $k$ for which $A_t \cap (iMe_1 - kx_u + Q') = \emptyset$.
It is clear that, for $M$ large enough, $\zeta_t$ is dominated by $\eta'_t = t - \eta_t$, where
$\eta_t$ is a slow version from Lemma 3.3 and $p' = (1 - p)^k$, for some appropriately
large $k$. The range of dependence $r$ depends on $M$ and on the angles between $v_1$ and $u$
and between $v_2$ and $u$, but is clearly finite. Therefore, (3.2) follows from Lemma 3.3(1). □


RANDOM GROWTH MODELS 199 


4. Flat edges of shapes for p close to 1. The setup is the same as in the
previous section. In Lemma 4.1 below, the direction of a rotation of a line $\ell$ is
determined by the direction of motion of the outward normal to the half-space
in $\ell^c$ which does not contain the origin.

The left-hand side of Figure 4 depicts a general situation in the statement and
proof of the lemma, while the right-hand side again presents a TG CA example.
This time the range 2 Box case has $\theta = 7$ (see Figure 6). Note that if $\ell$ is a rotation
of $\ell_u$ around $x_u^\ell$ by a small negative angle, then there are exactly $\theta - 1 = 6$ sites
below $\ell$. The same is true for rotations around $x_u^r$ by a small positive angle. This
translates to two line segments on the boundary of $K_{1/w}$ which meet at $u/w(u)$ at
a convex angle, of 45° in this case.


LEMMA 4.1. For a $u \in S^1$, assume that near $u/w(u)$ the boundary of $K_{1/w}$
consists of two lines at interior angle below $\pi$. Then there exist $x_u^\ell, x_u^r \in \ell_u \cap \mathcal{N}$
with the following properties. If $\ell$ is a small rotation of $\ell_u$ either through $x_u^\ell$ by
a negative angle, or through $x_u^r$ by a positive angle, then $\pi(L^-(\ell)) = 1$. In fact,
the smaller angles between $\partial K_{1/w}$ and $u/w(u)$ are the same as the smaller angles
between $x_u^\ell$ and $\ell_u$, and between $x_u^r$ and $\ell_u$, if $x_u^\ell$ and $x_u^r$ are chosen to be furthest
apart.


PROOF. The argument is very similar to the one for Lemma 3.1. The local
equation for $\partial K_{1/w}$ to the right of $u/w(u)$ is given by
$v \mapsto (v/w(u)) \cdot (\langle u, u_0 \rangle / \langle v, u_0 \rangle)$, for a suitably chosen $u_0$. Then

\[ \frac{1}{w(v)} \le \frac{1}{w(u)} \cdot \frac{\langle u, u_0 \rangle}{\langle v, u_0 \rangle} \tag{4.1} \]

if $v$ is to the left of $u$. If $x_u = -w(u)u_0/\langle u, u_0 \rangle$, this means that any small
rotation $\ell$ of $\ell_u$ around $x_u$ in the positive direction has $\pi(L^0(\ell)) = 1$. Now simply
move $x_u$ rightward on $\ell_u$ to the first point on $\ell_u \cap \mathcal{N}$ for which $\pi(L^0(\ell)) = 0$ for
small positive rotations $\ell$. Such a point must exist, or else $\pi(L^0(\ell_u)) = 1$, which
contradicts the definition of $w(u)$. This defines $x_u^r$, which must be in $\mathcal{N} \cap \ell_u$ as
small rotations contain no other sites of $\mathcal{N}$. The definition of $x_u^\ell$ is similar. □

FIG. 4. Illustration of Lemma 4.1.











LEMMA 4.2. Fix $u \in S^1$, and $x_u^\ell$, $x_u^r$ as in Lemma 4.1, chosen as far apart
as possible. If $v_1$ is a small positive rotation of $u$, then the convex wedge
$W^+ = H_u^- \cap H_{v_1}^-$ is invariant: $\mathcal{T}(W^+) = -x_u^r + W^+$. Similarly, if $v_2$ is a small
negative rotation of $u$, then the convex wedge $W^- = H_u^- \cap H_{v_2}^-$ is invariant:
$\mathcal{T}(W^-) = -x_u^\ell + W^-$.


PROOF. This proof is completely analogous to the one for Lemma 3.2. We
omit the details. □


Note that the two wedges from Lemma 4.2 are moving toward each other. In
particular, if we now dig any finite hole in $H_u^-$, it gets filled, as we state more
precisely in the next corollary.


COROLLARY 4.3. If $v_1$ and $v_2$ are as above, and
\[ A_0 = \bigl(H_u^- \cap (-Mv_1 + H_{v_1}^-)\bigr) \cup \bigl(H_u^- \cap (-Mv_2 + H_{v_2}^-)\bigr), \]
then
\[ \mathcal{T}^t(A_0) \supset tw(u)u + H_u^- \]
for $t \ge CM$.

Assume now that $u_1, u_2 \in S^1$ are such that $u_1/w(u_1)$ and $u_2/w(u_2)$ are on the
boundary of $\mathrm{co}(K_{1/w})$, and that $u_2$ is a positive (counterclockwise) rotation of $u_1$
(by the smaller angle between them). Assume also that for small positive (resp.
negative) rotations $v$ of $u_2$ (resp. $u_1$), $v/w(v)$ is not on the boundary of $\mathrm{co}(K_{1/w})$.
This assumption always holds in Cases 2 and 3 of Theorem 2 when $u_1/w(u_1)$ and
$u_2/w(u_2)$ are vertices of $\mathrm{co}(K_{1/w})$, and also holds for other vectors in Case 2.
Then

\[ (w(u_1)u_1 + H_{u_1}^-) \cap (w(u_2)u_2 + H_{u_2}^-) = z + H_{u_1}^- \cap H_{u_2}^- \]


covers a vertex of $L$ [and when $u_1/w(u_1)$ and $u_2/w(u_2)$ are vertices of $\mathrm{co}(K_{1/w})$
also portions of corresponding edges of $L$]. The equation above defines the vector
$z = z(u_1, u_2)$. While this wedge is by itself not necessarily invariant (although
for small $\mathcal{N}$ it often is), a bounded perturbation with a suitably rounded corner is
superinvariant [as in (3) of Lemma 4.5 below].


LEMMA 4.4. Let $u_1$ and $u_2$ be as above. Apply Lemma 4.2 to $u = u_1$ to get
the corresponding $W^-$ and $x_{u_1}^\ell$. The corner of $W^-$ moves slower than $H_{u_2}^-$, that is,

\[ \langle -x_{u_1}^\ell, u_2 \rangle < w(u_2). \]


PROOF. The conclusion is equivalent to $x_{u_1}^\ell \in L^-(\ell_{u_2})$. But this follows
because $x_{u_1}^\ell \in L^0(\ell_{u'})$ for any $u'$ between $u_1$ and $u_2$. [If $K_{1/w}$ were a straight line
between $u_1/w(u_1)$ and $u_2/w(u_2)$, $x_{u_1}^\ell$ would belong to $\ell_{u'}$ for any such $u'$.] □


An analogous version of Lemma 4.4 of course also holds for $u = u_2$ and the
corresponding $W^+$.


LEMMA 4.5. Assume $u_1$ and $u_2$ are as in Lemma 4.4. There exists a convex
wedge $W_{u_1,u_2}$ which is included in, and outside a bounded neighborhood of the
corner equal to, $H_{u_1}^- \cap H_{u_2}^-$ such that the following properties hold.


(1) When the part of the edge of $\mathrm{co}(K_{1/w})$ between $u_1/w(u_1)$ and $u_2/w(u_2)$ is
completely included in $K_{1/w}$,

\[ \mathcal{T}(W_{u_1,u_2}) = z + W_{u_1,u_2}. \]

(2) When the open part of the edge of $\mathrm{co}(K_{1/w})$ between $u_1/w(u_1)$ and $u_2/w(u_2)$
has empty intersection with $K_{1/w}$,


T (Wa, uj) D z + ((Wu,,u; + B200, 2) N Wu uj); 


for some $a > 0$.
(3) In every other case, $\mathcal{T}(W_{u_1,u_2}) \supset z + W_{u_1,u_2}$.


PROOF. The condition in (2) implies that every vector $v$ strictly between $u_1$
and $u_2$ has $x_{u_1}^\ell \in L^0(\ell_v)$ and $x_{u_2}^r \in L^0(\ell_v)$. This, together with Lemma 4.2, readily
proves (2): one simply makes $W_{u_1,u_2}$ start with and end with wedges considered
there, and connects them with a convex curve of small enough curvature.

The condition of (1) implies that every vector $v$ strictly between $u_1$ and $u_2$
has $x_{u_1}^\ell \in \ell_v$ and $x_{u_2}^r \in \ell_v$. Again, Lemma 4.2 implies that there is a succession
of invariant wedges which connect the rays with normals $u_1$ and $u_2$. The final
statement follows by a subdivision of the edge of $\mathrm{co}(K_{1/w})$ into subintervals of
types considered in (1) or (2). □


COROLLARY 4.6. Fix an $\varepsilon > 0$. There exists a convex set $L_\varepsilon \subset L$, which
agrees with $L$ outside the $\varepsilon$-neighborhood of its corners, so that for large
enough $M$,

\[ \mathcal{T}(M \cdot L_\varepsilon) \supset (M + 1)L_\varepsilon. \]


PROOF. The wedges from Lemma 4.5 can readily be combined to approximate
an arbitrarily large multiple of $L$, within an error confined to a constant
distance from its corners. □


Arguably, oriented percolation is the most useful comparison model in ran- 
dom spatial processes. We now introduce the version we will use. While this is in 
fact a random perturbation of a one-dimensional CA, it is, as the name suggests, 
best to think about it as a random occupied set which tries to establish long-range 
connections. Sites $(m, n) \in \mathbb{Z}_+ \times \mathbb{Z}_+$ are either occupied or empty ($n$ is often
referred to as a level). The basic ingredients are Bernoulli($p'$) random variables
$b(m, n)$, $m, n \ge 1$, such that $b(m_1, n_1)$ and $b(m_2, n_2)$ are independent whenever
$|m_1 - m_2| > r$ or $n_1 \ne n_2$. (It is important that $r$ does not depend on $p'$.) Prescribe
some occupied set in $\mathbb{Z}_+ \times \{0\}$. For $m > 0$ and $n \ge 1$, $(m, n)$ is occupied
if $b(m, n) = 1$ and at least one of its neighbors $(m, n - 1)$ and $(m - 1, n - 1)$ is
occupied.
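
The occupation rule just described is easy to simulate. What follows is a minimal sketch only, assuming independent variables $b(m, n)$ (that is, range of dependence $r = 0$) and a finite window $1 \le m \le$ `width`; the function name and its parameters are hypothetical.

```python
import random


def oriented_percolation(initial, levels, p, width, seed=0):
    """Return the list of occupied sets on levels 0, 1, ..., levels.

    A site (m, n) with n >= 1 is occupied when b(m, n) = 1 and at least one
    of (m, n - 1), (m - 1, n - 1) is occupied on the previous level.
    """
    rng = random.Random(seed)
    occupied = set(initial)
    history = [occupied]
    for _ in range(levels):
        nxt = set()
        for m in range(1, width + 1):
            has_parent = m in occupied or (m - 1) in occupied
            # The coin is only consulted when a parent is occupied.
            if has_parent and rng.random() < p:
                nxt.add(m)
        occupied = nxt
        history.append(occupied)
    return history
```

With $p' = 1$ the occupied interval $(0, M]$ simply extends by one site to the right per level, which gives a quick sanity check of the neighbor convention.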


LEMMA 4.7. Fix any $\alpha \in (0, 1)$. Also fix a large integer $M$ and let $(0, M] \times \{0\}$
be occupied. If $p'$ is close enough to 1, then for large enough $C = C(p')$ the
probability of the following two events converges to 1 as $M \to \infty$.


(1) Any $(m, n)$ with $\alpha n \le m \le (1 - \alpha)n$ and $n = 0, 1, \ldots$ is within distance $C \log n$
of an occupied point.

(2) For every $n$ there is a connection (through neighbors) of occupied points from
level $n$ down to level 0 which stays entirely in $\{(m, n) : n \ge 1,\ \alpha n \le m \le \alpha n +
M + C \log n\}$.


PROOF. These are standard applications of contour arguments (see, e.g., [5]),
so we omit the details. □


PROOF OF PROPOSITION 1.2. If $y_i$, $i = 0, \ldots, R - 1$, are vectors pointing to
the $R$ successive corners of $L$ in a counterclockwise order, then

\[ (M + 1)L_\varepsilon = \bigcup_{i=0}^{R-1} (y_i + M L_\varepsilon). \]
We will now concentrate on the edge between $y_0$ and $y_1$. Start the random
dynamics from $A_0 = M \cdot L_\varepsilon \cap \mathbb{Z}^2$. Say that $(m, n) \in \mathbb{Z} \times \mathbb{Z}_+$ is occupied if all the lattice
sites in $(n - m)y_0 + my_1 + ML_\varepsilon$ are at time $n$ included in $A_n$.

If $(m, n)$ is occupied, then both $(m, n + 1)$ and $(m + 1, n + 1)$ are occupied with
probability at least $p'$, which is a power of $p$ given by the number of lattice points
in $(M + 1)L_\varepsilon \setminus ML_\varepsilon$. Moreover, given any configuration of occupied points on
level $n$, points on level $n + 1$ which are $r$ apart are occupied independently. Here
$r$ is any integer such that $(ry_0 + (M + 1)L_\varepsilon) \cap (ry_1 + (M + 1)L_\varepsilon) = \emptyset$.

The occupied points hence form an oriented percolation with a finite range of
dependence. For $p'$ close enough to 1, Lemma 4.7(1) shows that the conclusion
holds with probability which converges to 1 as $M \to \infty$. However, this is enough
as every set is covered in almost surely finite time. □


5. Exact stability in Cases 2 and 3. We continue with the setup of the last
two sections. Let us begin with a statement of Toom's theorem. We will only state
the version we need here (for which the proof is in [35]), although the conclusion
holds in considerably greater generality [4, 36]. A two-dimensional Toom rule is a
deterministic CA $\mathcal{T}_T$ given by a map $\pi_T$ with the following property:


(T) There exists a line $\ell_T$ which does not go through the origin such that
$\pi_T(S) = 1$ if and only if either $L^0(\ell_T) \subset S$ or $\mathcal{N} \setminus L^-(\ell_T) \subset S$.


Now introduce space-time error sites, those sites $(x, t)$ for which $b(x, t) = 0$.
Here $b(x, t) \in \{0, 1\}$, $x \in \mathbb{Z}^2$, $t = 1, 2, \ldots$, are assigned before the dynamics starts.
The state of the Toom rule with errors is then given by $\xi_t \in \{0, 1\}^{\mathbb{Z}^2}$, which satisfies
$\xi_0 \equiv 1$ and

\[ \xi_{t+1} = \mathcal{T}_T(\xi_t) \cap \{x : b(x, t + 1) = 1\}, \qquad t = 0, 1, \ldots. \]


To develop some intuition, note that without errors a finite island of 0's in a sea
of 1's gets eroded by $\mathcal{T}_T$, as it is "squeezed" between two half-spaces with
boundaries parallel to $\ell_T$. (However, this island may move in the process.) Thus (T) is
often called the eroder condition. A natural question is what happens with such a
rule under persistent introduction of low-density errors.


THEOREM 5.1. If $\xi_t(x) = 0$, then there exists a Toom graph $G = G(x, t)$,
whose vertex set is included in $\{(z, s) : s \le t,\ z \in \mathbb{Z}^2\}$ and which satisfies the
following properties, for some sufficiently large $C > 0$:


(1) The number of possible graphs $G$ with $m$ edges is at most $C^m$.

(2) For a graph with $m$ edges there are at least $m/C$ vertices which are error sites.

(3) For any $r \ge 0$, if $\xi_t(y) = 0$ for $\|x - y\| \le r$, then the number of edges of $G$ is
at least $\max\{r/C, 3\}$.


For the proof see [35].

In the classical application of Theorem 5.1, $b(x, t)$ are i.i.d. Bernoulli($p$). Then
$P(\xi_t(x) = 1)$ converges to 1 as $p \to 1$, uniformly in $(x, t)$. Thus $\xi_t$ has an
invariant measure with density close to 1. This also follows when $b(x, t)$ are not
independent, but have uniformly bounded range of dependence in spacetime.


LEMMA 5.2. Assume $u_1$ and $u_2$ are as in Lemma 4.5, with corresponding
wedge $W = W_{u_1,u_2}$. Start a $p$-perturbation of $\mathcal{T}$ from $A_0 = W \cap \mathbb{Z}^2$. Consider the
following two events:


(1) $E_{x,t,M} = \{x \text{ is within distance } M \text{ of } A_t\}$.
(2) $F = \{$there is a $C$ so that, within the lattice ball of radius $t^2$, $(t - C \log t)z + W \subset A_t$ for all large enough $t\}$.


Then for any $\varepsilon > 0$, there exists an $M$ so that for any point $x \in tz + W$ the
event $E_{x,t,M}$ happens with probability $1 - \varepsilon$ (uniformly in $x$ and $t$). Moreover,
$P(F) = 1$.


PROOF. It is convenient to translate the dynamics so that $W$ is fixed, that is,
consider $A'_t = (A_t - tz) \cap W$. Also, rotate the lattice so that $W$ has its maximum
at the origin and $u_1$ and $u_2$ are situated symmetrically with respect to the $y$-axis.
It then follows from Corollary 4.3, Lemmas 4.4 and 4.5 that there are finite
constants $C$ and $t_0$ (which again only depend on $\mathcal{T}$) so that the construction in the
following paragraphs is possible.

Cut a finite neighborhood of the origin with a horizontal line $y = -C$, and
let $t_0$ be the time the deterministic dynamics needs to fill $W$ again if sites above
the cut are removed. Run the random system in multiples of time $t_0$, with the
proviso that if any site at time $nt_0$ is 0 above the cut, then all sites above the cut
are set to 0 immediately. Also make all the sites above the cut 0 if during the time
interval $[(n - 1)t_0 + 1, nt_0]$ a site within $C$ of the sites of the cut does not become
occupied because of a bad coin flip, that is, although the deterministic dynamics
would make it occupied the random one does not. The resulting set of occupied
points at time $nt_0$ is called $A''_n$.

If an integer site $x \in W$ is not in $A''_n$, then either an integer site in $W$ strictly
below the horizontal line through $x$ must be 0 in $A''_{n-1}$, and a site on or above the
horizontal line through $x$ must also be 0 in $A''_{n-1}$, or else a site within distance $C$
of $x$ does not become occupied although the deterministic dynamics would make
it occupied. In the latter case we call $x$ an error site. It is clear that error sites have
finite range of dependence in spacetime and occur with probability at most $1 - p'$,
where $p'$ is above a fixed power of $p$ and thus can be made arbitrarily close to 1.

By Theorem 5.1, uniformly for sites $x \in W$ and $n$,

\[ P(x \text{ is at distance at least } M \text{ from } A''_n) \le \sum_{m \ge M/C} C^m (1 - p')^{m/C} \le (C(1 - p'))^{M/C}. \tag{5.1} \]
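
The Peierls-type sum in (5.1) is a geometric series, which is what makes the bound exponentially small in $M$. As a rough numerical illustration (a sketch, not part of the proof), write $q$ for the per-site error probability; the names `peierls_tail` and `peierls_tail_closed_form` are hypothetical.

```python
import math


def _ratio(C, q):
    """Common ratio C * q**(1/C) of the geometric series."""
    rho = C * q ** (1.0 / C)
    if rho >= 1.0:
        raise ValueError("series diverges: need C * q**(1/C) < 1")
    return rho


def peierls_tail(C, q, M, terms=2000):
    """Numerically sum C**m * q**(m/C) over m >= ceil(M / C)."""
    rho = _ratio(C, q)
    m0 = math.ceil(M / C)
    return sum(rho ** m for m in range(m0, m0 + terms))


def peierls_tail_closed_form(C, q, M):
    """Closed form rho**m0 / (1 - rho) of the same geometric tail."""
    rho = _ratio(C, q)
    m0 = math.ceil(M / C)
    return rho ** m0 / (1.0 - rho)
```

The series converges only when $C q^{1/C} < 1$, which is why the error probability has to be small compared with the entropy constant $C$ before the tail decays in $M$.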


Now the claim concerning the event (1) readily follows. To prove that $P(F) = 1$,
note that we can choose $M = C \log n$ (this $C$ does depend on $p$), so that the
probability in (5.1) is below $1/n^6$ and thus
\[ P(\text{some } x \in W \cap B(0, n^2) \text{ is at distance at least } C \log n \text{ from } A''_n) \le C/n^2. \]
From local regularity and Lemma 2.1, it now follows that
\[ P\bigl((tz + W) \cap B(0, t^2) \not\subset A_{t + C \log t}\bigr) \le C/t^2, \]
and Borel-Cantelli completes the proof. □




PROOF OF THEOREM 2 IN CASES 1 AND 2. Construct a convex set $L_n$ in the
following manner:


Step 1. Start from $nL$, a multiple of the shape of the deterministic model.

Step 2. Every corner of $nL$ whose corresponding open edge of $K_{1/w}$ contains
directions $u$ such that $u/w(u) \in \partial K'$ is "logarithmically rounded off" by
introducing for each such $u$ an edge with normal $u$ and length $C \log n$
(where $C$ is some large constant whose value will become clear below),
and ensuring that the resulting set is a convex subset of $nL$.

Step 3. Round off every corner of the set obtained in Step 2 to produce locally a
translate of $W = W_{u_1,u_2}$ of Lemma 4.5.


If $R$ is the number of directions $u$ for which $u/w(u) \in \partial(\mathrm{co}(K_{1/w}))$, then
Step 3 has produced $R$ wedges $W$, which we label $W_0, \ldots, W_{R-1}$. Start with some
large $L_M$ and couple (by using the same coin flips) the resulting $R + 1$ dynamics: one
started from integer sites in each wedge, and the last one started from those
inside $L_M$.

If the percolation model introduced in the proof of Proposition 1.2 survives for
all time on each edge of $L_n$, then we call the coupling successful. This means that
the state of each site is exactly the same as the state of the same site in one of the
wedges.

Fix an $x \in L_n$. By the FKG inequality and Lemma 4.7, the following three events
simultaneously happen with probability at least $1 - \varepsilon$ if $M$ is large enough:


(1) The coupling is successful.
(2) $E_{x,t,M}$ happens for a suitable wedge $W_i$ guaranteed by (1).
(3) $F$ happens for every wedge $W_i$, $i = 0, \ldots, R - 1$.


For an arbitrary initial set which fills space, we once again use the fact that $L_M$
is covered in finite time to get (S1), (S2), (S3) in Case 1, and (S1), (S3) in Case 2.

It remains to show that the a.s. deviations from a corner in Case 2 are at least
logarithmic. Pick a corner $y_0 \in L$ whose corresponding open edge of $\mathrm{co}(K_{1/w})$
contains $u/w(u)$, for some direction $u$. Note that the boundaries of $nL$ and
$nw(u)u + H_u^-$ intersect at exactly $ny_0$. Finally, consider the infinite wedge $W$
defined by the corner and locate its vertex at the origin.

The number of sites in

\[ S_k = W \cap \bigl((-kw(u)u + H_u^+) \cap (-(k - 1)w(u)u + H_u^-)\bigr) \]

is bounded above by $Ck$. Let $T_k$ be a geometric random variable with success
probability $q_k = (1 - p)^{|S_k|}$.
Start the dynamics from sites in $ML$, where $M$ is arbitrary. It is clear that

\[
\begin{aligned}
P(ty_0 \text{ is at distance at least } ck \text{ from } A_t) &\ge P(T_1 + \cdots + T_k \le t) \\
&\ge P(T_k^{(1)} + \cdots + T_k^{(k)} \le t),
\end{aligned}
\tag{5.2}
\]

where $T_k^{(i)}$ are i.i.d. copies of $T_k$. Now by Chebyshev

\[ P(T_k^{(1)} + \cdots + T_k^{(k)} > t) \le \frac{k \operatorname{Var}(T_k)}{(t - kE(T_k))^2} \le \frac{k}{(tq_k - k)^2}, \tag{5.3} \]

and it is easy to check that when $k = c \log t$ for a sufficiently small $c = c(p)$ the
upper bound in (5.3) is $O(t^{-3/2})$. The desired result now follows from (5.2) and
Borel-Cantelli. □
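
The Chebyshev step can be made concrete. The sketch below is illustrative only: it assumes geometric variables supported on $1, 2, \ldots$ with success probability $q$, so that the mean is $1/q$ and the variance is $(1 - q)/q^2$; the function names are hypothetical.

```python
def geometric_mean_var(q):
    """Mean and variance of a geometric(q) variable on {1, 2, ...}."""
    return 1.0 / q, (1.0 - q) / q ** 2


def chebyshev_tail_bound(k, q, t):
    """Chebyshev bound on P(sum of k i.i.d. geometric(q) variables > t).

    Only meaningful when t exceeds the mean k / q of the sum.
    """
    mean, var = geometric_mean_var(q)
    gap = t - k * mean
    if gap <= 0:
        raise ValueError("need t > k / q for the bound to apply")
    return k * var / gap ** 2
```

Multiplying numerator and denominator by $q^2$ shows that this bound equals $k(1 - q)/(tq - k)^2$, hence is at most $k/(tq - k)^2$.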


We note that logarithmic a.s. fluctuations (S3) are optimal in Case 1 of
Theorem 2 as well, as any of the sites in $(t + 1)L \setminus tL$ can stay unoccupied for time $c \log t$
as a result of bad coin flips. This happens independently for each such site with
probability $t^{-1/4}$ if $c = c(p)$ is small enough. Since the number of such sites is
linear in $t$, a large deviation computation shows that the probability that this happens
for at least one site is at least $1 - \exp(-c\sqrt[4]{t})$ [19].


6. An example. In this section we present an example of a one-parameter
family of random rules $\mathcal{T}_p$, $p \in [0, 1]$, for which we can compute the half-space
velocities explicitly. Every such example seems to be similarly based on the
models introduced in [14] and [30]. Apart from the exactly stable cases, the
example which follows therefore seems to be the only nontrivial instance of a random
growth model with known shape.

The best way to think about this model is on the hexagonal lattice, but we will
describe it so that it fits into our $\mathbb{Z}^2$ setup. The model's neighborhood $\mathcal{N}$ consists of
seven sites, the von Neumann neighborhood with two added diagonal sites: $(1, -1)$
and $(-1, 1)$. Then $\pi_p(S) = 1$ when at least one of the following four conditions
is satisfied: $(-1, 0) \in S$, $(0, 1) \in S$, $\{(1, -1), (0, -1)\} \subset S$, $\{(-1, 1), (1, 0)\} \subset S$.
On all other nonempty sets $S$, $\pi_p(S) = p$. This is a $p$-perturbation of the additive
model, but not a standard one. Nevertheless we will denote the half-space
velocities by $w_p$ and the shapes by $L_p$. Supercriticality and local regularity are trivial.

Note that $\pi_p$ interpolates between two supercritical growth models. When
$p = 1$, the CA is additive with neighborhood $\mathcal{N}$, which thus has $K_{1/w_1} = \mathcal{N}^*$
[which is $\mathrm{co}(\mathcal{N})$ rotated by 90°], and $L_1 = \mathrm{co}(\mathcal{N})$. When $p = 0$, the vertices
of $K_{1/w_0}$ are $(0, \pm 1)$, $(-1, 1)$, $(1, -1)$, $(1, 2)$ and $(-1, -2)$, while $L_0$ is a
parallelogram with vertices at $(\pm 1, 0)$, $(-1/3, 2/3)$ and $(1/3, -2/3)$. What sets this model
apart from a generic example is that certain initial sets make it exactly solvable for
every $p$, in the sense that $P(x \in A_t)$ can be expressed as a Fredholm determinant of
an explicitly known operator on $\ell^2$ [14]. These initial sets are four wedges, which
together cover the plane: $W_1 = H_{e_2}^- \cap H_{-e_1}^-$, $W_2 = H_{e_2}^- \cap H_{e_d}^-$, $W_3 = H_{-e_2}^- \cap H_{e_1}^-$,
$W_4 = H_{-e_2}^- \cap H_{-e_d}^-$, where $e_d = (1, 1)/\sqrt{2}$.

Assume that $A_0 = W_1 \cap \mathbb{Z}^2$. The first observation is that $A_t = \{(x, y) : y \le
g_t(x)\}$, where $g_t : \mathbb{Z} \to [-\infty, t]$ is a nondecreasing function. This is easily proved
by induction. As a consequence, whenever $x$ has its east (i.e., right) neighbor in $A_t$,
it also has its southeast diagonal neighbor in $A_t$. On this initial set, the rule is
therefore symmetric across the line $y = -x$. Now let, for every positive integer $n$,
$h_t(n) = g_t(-t + n) - n$. Then $h_t(n) = -\infty$ for $n < 0$, $h_t(0)$ is a random walk
which jumps by $+1$ (resp. stays put) with probability $p$ (resp. $1 - p$), and for $n > 0$
$h_{t+1}(n)$ equals $h_t(n) + 1 = h_t(n - 1)$ automatically whenever $h_t(n - 1) > h_t(n)$.
Finally, when $h_t(n - 1) \le h_t(n)$, $h_{t+1}(n) = h_t(n) + 1$ with probability $p$ and
otherwise $h_{t+1}(n) = h_t(n)$. This establishes the equivalence. It follows that there
exists a self-invertible function $\varphi : [p, 1] \to [p, 1]$, so that $A_t/t$ converges to the
region $L_{W_1} = \{(x, y) \in \mathbb{R}^2 : y \le 1,\ -\varphi(y) \le x \text{ for } p \le y, \text{ and } -1 \le x \text{ for } y < p\}$.
The function $\varphi$ has the following explicit form [14]:

\[ \varphi(y) = 1 - p - (1 - 2p)y + 2\sqrt{p(1 - p)y(1 - y)}. \]
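
The height process $h_t$ described above can be simulated directly. The following is a hedged sketch: it tracks only the sites $n = 0, \ldots, N - 1$, assumes the initial profile $h_0(n) = -n$ (which corresponds to $g_0(x) = 0$ for $x \ge 0$), and uses independent coin flips; all function names are hypothetical.

```python
import random


def step_heights(h, p, rng):
    """One synchronous update of the height profile h(0), ..., h(N-1)."""
    new = list(h)
    # Site 0 performs a random walk: +1 with probability p, else stays put.
    if rng.random() < p:
        new[0] = h[0] + 1
    for n in range(1, len(h)):
        if h[n - 1] > h[n]:
            new[n] = h[n] + 1      # automatic increase
        elif rng.random() < p:
            new[n] = h[n] + 1      # random increase with probability p
    return new


def simulate_heights(n_sites, n_steps, p, seed=0):
    """Run the process for n_steps from the profile h_0(n) = -n."""
    rng = random.Random(seed)
    h = [-n for n in range(n_sites)]
    for _ in range(n_steps):
        h = step_heights(h, p, rng)
    return h
```

In the two deterministic extremes one gets $h_t(n) = t - n$ for $p = 1$ and $h_t(n) = \min(0, t - n)$ for $p = 0$. For intermediate $p$ the profile always satisfies $h_t(n - 1) \le h_t(n) + 1$, consistent with the statement that the automatic move occurs exactly when $h_t(n - 1) = h_t(n) + 1$.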


The argument is similar when $A_0 = W_2 \cap \mathbb{Z}^2$. (In fact, that this case is equivalent
to the above can be seen by mapping the model onto the hexagonal lattice, where it
has fourfold symmetry.) If $g_t(x) = \sup\{y \in \mathbb{Z} : (-y + x, y) \in A_t\}$ (with $\sup \emptyset = -\infty$),
then $h_t(n) = g_t(t - n) - n$ has the same evolution as the $h_t$ from the previous
paragraph. It follows that this time $A_t/t$ converges to $L_{W_2} = \{(x, y) \in \mathbb{R}^2 : y \le 1,\
x \le \varphi(y) - y \text{ for } p \le y, \text{ and } x \le 1 - y \text{ for } y < p\}$. The remaining two wedge
shapes are obtained by symmetry: $L_{W_3} = -L_{W_1}$ and $L_{W_4} = -L_{W_2}$.

The proof of the next proposition is very similar to the proof of Corollary 1.1, 
and hence omitted. (See also [10].) 


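PROPOSITION 6.1. Assume that a perturbation of a locally regular supercritical
CA is given by $\pi$. Assume that its initial set is a wedge: $A_0 = W \cap \mathbb{Z}^2$, where
$W = H_{u_1}^- \cap H_{u_2}^-$ and $u_1$ and $u_2$ form an angle in $(0, \pi)$. Then

\[ \frac{A_t}{t} \to \bigcap \{w_\pi(u)u + H_u^- : W \subset H_u^-\} = (K_{1/w_\pi} \cap W^*)^*, \]

almost surely and in Hausdorff metric within any large ball of radius $C$.

This proposition, together with Corollary 1.1, immediately implies that in the
present example,

\[ L_p = L_{W_1} \cap L_{W_2} \cap L_{W_3} \cap L_{W_4}, \]

and therefore its top half comprises points $(x, y)$ which satisfy

\[
\begin{aligned}
-\varphi(y) \le x \le \varphi(y) - y, &\quad \text{if } p \le y \le y_0, \\
-1 \le x \le 1 - y, &\quad \text{if } 0 \le y < p,
\end{aligned}
\]

where

\[
y_0 = y_0(p) =
\begin{cases}
1, & p \ge 1/2, \\[4pt]
\dfrac{2(1 - p)}{3 - \sqrt{8p}}, & p < 1/2.
\end{cases}
\]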
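
The formulas for $\varphi$ and $y_0$ can be checked numerically. The sketch below assumes the explicit form $\varphi(y) = 1 - p - (1 - 2p)y + 2\sqrt{p(1 - p)y(1 - y)}$ and the piecewise expression for $y_0$; the function names are hypothetical, and the clamp inside the square root only guards against floating-point round-off.

```python
import math


def phi(y, p):
    """Boundary function phi on [p, 1] for parameter p."""
    radicand = max(0.0, p * (1 - p) * y * (1 - y))
    return 1 - p - (1 - 2 * p) * y + 2 * math.sqrt(radicand)


def y0(p):
    """Height of the top of the shape L_p."""
    if p >= 0.5:
        return 1.0
    return 2 * (1 - p) / (3 - math.sqrt(8 * p))
```

Under these formulas $\varphi(p) = 1$, $\varphi(1) = p$, $\varphi$ is its own inverse on $[p, 1]$, and for $p < 1/2$ the value $y_0$ solves $\varphi(y_0) = y_0/2$, which is where the two constraints $-\varphi(y) \le x$ and $x \le \varphi(y) - y$ meet. At $p = 0$ this gives $y_0 = 2/3$, matching the vertex $(-1/3, 2/3)$ of $L_0$.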


FIG. 5. Shapes $L_p$ for $p = 0, 0.1, \ldots, 1$.


Hence the top half of the shape $L_p$ is convex and $C^1$, but not strictly convex,
for $p > 1/2$; is strictly convex and $C^1$ only at $p = 1/2$; and for $p < 1/2$ is strictly
convex with a corner at its highest point $(-y_0/2, y_0)$ above the $x$-axis. See Figure 5
for a plot.

We note that fluctuations from the limiting shape in every direction, except
$\pm(-1/2, 1)$ when $p \le 1/2$, can be obtained from [14]. For example, consider
$\alpha \in (-1, (-y_0/2) \wedge (-p))$ and let $g_t(x) = \inf\{y \in \mathbb{Z} : (x, y) \in A_t\}$. (It is easy to
see that sites in $A_t$ with a fixed $x$-coordinate always form an interval if they do so at
$t = 0$.) Then $(g_t(\lfloor \alpha t \rfloor) - \varphi(-\alpha)t)/t^{1/3}$ converges in distribution to a nondegenerate
random variable. This follows from the fact that for such $\alpha$ the evolution of $A_t$
starting from a finite set and from $W_1 \cap \mathbb{Z}^2$ can be coupled so that the difference
of respective $g_t(\lfloor \alpha t \rfloor)$ is stochastically bounded.


7. Exact stability for box neighborhood TG CA. The TG CA with box
neighborhood of radius $\rho$ has $\rho(2\rho + 1)$ supercritical thresholds $\theta$ [10], and the
same number of corresponding $K_{1/w}$, which we label $K_1, \ldots, K_{\rho(2\rho+1)}$, and
superimposed with different shades in Figure 6. Let $\mathcal{K} = \mathcal{K}_\rho = \partial K_1 \cup \cdots \cup \partial K_{\rho(2\rho+1)}$. At
first this set appears to be of bewildering complexity (cf. the range 5 example on
the right-hand side of Figure 6). However, perhaps the first feature revealed upon
closer inspection is that $\mathcal{K}$ consists entirely of straight lines, called $K$-lines, which
extend through the entire picture. [In fact, if we were to include the critical $K_{1/w}$
for $\theta = \rho(2\rho + 1) + 1, \ldots, \rho(2\rho + 1) + \rho$ all these lines would continue
indefinitely.] There is one-to-one correspondence between the $K$-lines and the points of
$\mathcal{N} \setminus \{0\}$, as we now explain.

FIG. 6. The 10 (resp. 55) supercritical $K_{1/w}$'s for range 2 (resp. range 5).

For any of the $(2\rho + 1)^2 - 1$ sites $x \in \mathcal{N} \setminus \{0\}$, start with a line through $x$ and 0.
Then rotate this line in the positive direction until it hits 0 again. Call all such
rotations $\ell_{x,\phi}$, $0 \le \phi < \pi$. The set of all cardinalities $A(x, \phi) = |L^-(\ell_{x,\phi})|$ is
exactly the set of $\theta$ for which $x \in \ell_u$, for some $u$. Moreover, by Lemma 3.1,
whenever $x$ is the only site in $\ell_{x,\phi} \cap \mathcal{N}$, this line determines a direction pointing toward
the interior of an edge of $K_{A(x,\phi)}$, namely the direction of the normal to $\ell_{x,\phi}$ which
points toward the origin. When the number of points in $\ell_{x,\phi} \cap \mathcal{N}$ to the left and
to the right of $x$ are equal, the direction points to the interior of an edge in $K_\theta$ for
$\theta = A(x, \phi) - |\{\text{points to the left of } x\}|$.

Similarly, a line containing exactly two sites in $\mathcal{N} \setminus \{0\}$ determines a
direction in which exactly two $\partial K_\theta$ meet (at a point which is a vertex of both, for
one a convex vertex, for the other a concave one). A line containing exactly three
points in $\mathcal{N} \setminus \{0\}$ determines a direction in which exactly three $\partial K_\theta$ meet (but this
point is a vertex of only two of them). And so on. We are now ready to prove the
next proposition, which in particular allows unambiguous reconstruction of all $K_\theta$
from $\mathcal{K}$.


PROPOSITION 7.1. Two different $\partial K_\theta$ intersect only in a discrete set of points.
Moreover, all finite tiles of $\mathcal{K}$ are triangles or quadrilaterals.


PROOF. The first statement follows since for all but a discrete set of
rotations $\phi$, $x$ is the only site in $\ell_{x,\phi} \cap \mathcal{N}$ and so the corresponding edge lies only
in $K_{A(x,\phi)}$. Fix a $\theta_0$ and assume $\ell_{x,\phi} = \ell_u$ for some direction $u$. If $\ell_u$ contains
two or more points in $\mathcal{N}$, then $\partial K_{\theta_0}$ does not intersect $\partial K_{\theta_0+1}$ if and only if
$|L^-(\ell_u)| = \theta_0$. Let $x$ be the rightmost point in $\ell_u \cap \mathcal{N}$ such that $\ell_{x,\phi} = \ell_u$ for
some $\phi$. Decrease $\phi$ to the largest $\phi'$ such that $\ell_{x,\phi'} \cap \mathcal{N}$ contains at least two
points. Clearly $\ell_{x,\phi'}$ is a line $\ell_u$ for $\theta = \theta_0 + 1$ (and perhaps some larger $\theta$'s).
A similar argument holds in the other direction, proving that, among two
consecutive vertices of $\partial K_{\theta_0}$, at least one belongs to $\partial K_{\theta_0+1}$. This is clearly enough. □


For any $x$, let $A^*(x) = \inf_\phi |A(x, \phi)|$.


PROPOSITION 7.2. $\partial K'$ includes a line segment if and only if $\theta = A^*(x)$,
for some $x$. Moreover, for every $x$, $A^*(x) = |A(x, \phi)|$ for some $\phi$ such that $\ell_{x,\phi}$
exits $\mathcal{N} = B_\infty(0, \rho) \subset \mathbb{R}^2$ through the neighboring (rather than opposite) sides.
Finally, each $K$-line includes exactly one line segment on $\partial K'$, for some $\theta$.


PROOF. Pick an $x$ and $\phi_0$ so that $x$ is the only site in $\ell_{x,\phi_0} \cap \mathcal{N}$, and let
$\theta = A(x, \phi_0)$. For this $\theta$ and normal $u$ to $\ell_{x,\phi_0}$, $\ell_u = \ell_{x,\phi_0}$. For the first assertion,
it suffices to show that if $\theta > A^*(x)$, then $u/w(u)$ cannot lie on $\partial(\mathrm{co}(K_\theta))$. But if
it would lie there, all of $K_{1/w}$ would have to lie on one side of the line of $K_{1/w}$
determined by $x$ and $\phi_0$ (see Lemma 3.1), meaning that $A(x, \phi) \ge \theta$ for every $\phi$,
a contradiction.

For the second assertion, assume the said line exits the left- and right-hand sides
of the square. We can also assume that $x$ lies either strictly inside the third
quadrant, or on the negative $y$-axis. In both cases, rotate the line around $x$ to angle $\phi'$
in the negative direction just past the southeast corner of the square; this results in
$A(x, \phi') \le A(x, \phi)$ (with equality in the second case).

For the final claim, assume that $x$ is as in the above paragraph and that the
line $\ell_{x,\phi}$ produces $A^*(x)$. Now rotate it in the negative direction to the smallest
angle $\phi'$ for which the number of points on the left of $x$ on $\ell_{x,\phi'} \cap \mathcal{N}$ is larger than
the number of points on the right. If $a$ and $b$ are the lengths of the line segments
on $\ell_{x,\phi'}$ from the left and bottom edges of $\mathcal{N}$, respectively, then $a > b$. This
remains the case for any $\phi'' < \phi'$ and thus further rotations in the negative direction
only lose more points. An analogous argument works for positive rotations. □


Therefore, each $K$-line contributes exactly one edge on exactly one $\partial K'$. What
produces a prevalence of exactly stable cases are those $\partial K'$ which have more than
their share of edges. For example, it is immediate by symmetry that the number
of edges on any $\partial K'$ is either 0 or at least 4. The number of $\theta$ with lack of exact
stability hence does not exceed $((2\rho + 1)^2 - 1)/4 = \rho^2 + \rho$, and therefore the
number of exactly stable ones is at least $\rho^2$. This argument is quite simple, yet it
fails to produce any way to identify a single exactly stable case. Our next result
remedies this somewhat. However, we do not have an algorithm which lists more
than $O(\rho)$ cases of either type.




PROPOSITION 7.3. All $\theta \ge 2\rho^2 + 1$ are exactly stable cases. On the other
hand, all $\theta \le \rho$ and $\theta = \rho + 1 + i(2\rho + 1)$, $i = 0, \ldots, \rho - 1$, are not exactly
stable.


PROOF. For the second assertion, consider first $\theta \le \rho$ and take $x = (-\rho +
\theta - 1, -\rho)$. For any angle $\phi$, $A(x, \phi) \ge \theta$. This immediately implies that the
edges in $K_{1/w}$ adjacent to the positive $y$-axis lie in $\partial K'$. If $\theta = \rho + 1 + i(2\rho + 1)$,
$i = 0, \ldots, \rho - 1$, there is instead a single edge, perpendicular to the $y$-axis.

For the first assertion, it is enough to prove that all first-quadrant boundary edges
of such $K_\theta$ have slopes in $[1, \infty]$. Equivalently, take $x \in \mathcal{N}$ in the third quadrant
strictly above the line through the origin with slope 1, and a line $\ell$ through $x$ with
normal a negative rotation of $e_2$ by angle at most $\pi/4$ and such that $x$ is the only
point in $\ell \cap \mathcal{N}$. Then $|L^-(\ell)| < \theta$. This is certainly true when $u$ is close to vertical,
and further rotations can only decrease $|L^-(\ell)|$. □


Note that the proof of the above proposition shows that the $\theta = 2, \ldots, \rho$
thresholds have at least eight edges in $\partial K'$, which improves the lower bound on the
number of exactly stable cases to $\rho^2 + \rho - 1$. This is a good lower bound for
small $\rho$, although the exact enumerations are taxing.
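
The two counting bounds can be reproduced by simple arithmetic. The sketch below encodes one reading of the counting argument, namely that each $K$-line contributes one edge, each threshold that is not exactly stable uses at least 4 edges, and the thresholds $\theta = 2, \ldots, \rho$ use at least 8; the function name is hypothetical.

```python
def stability_bounds(rho):
    """Return (basic, improved) lower bounds on the number of exactly stable thresholds."""
    thresholds = rho * (2 * rho + 1)      # supercritical thresholds
    k_lines = (2 * rho + 1) ** 2 - 1      # one edge per K-line
    # Basic bound: every non-stable threshold uses at least 4 edges.
    max_unstable_basic = k_lines // 4
    basic = thresholds - max_unstable_basic
    # Improved bound: thresholds 2..rho use at least 8 edges, i.e. 4 extra each.
    extra_edges = 4 * max(rho - 1, 0)
    max_unstable_improved = (k_lines - extra_edges) // 4
    improved = thresholds - max_unstable_improved
    return basic, improved
```

For every $\rho \ge 1$ this returns $(\rho^2, \rho^2 + \rho - 1)$, in agreement with the two bounds stated in the text.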


PROOF OF THEOREM 3. By Proposition 7.2, we can reformulate the
problem as follows. Consider all integer points $(a, b)$, $1 \le a, b \le \rho$. Take a line $\ell$
through $(a, b)$ which intersects the positive halves of both axes. For each such
line, let $A(\ell)$ be the number of integer points in the closed triangle $T(\ell) \subset \mathbb{R}^2$
bounded by $\ell$ and the positive halves of the axes. Finally, let $\ell^* = \ell^*(a, b)$ be a
line which minimizes $T(\ell)$. We need to find an upper bound for the number of
different $T(\ell^*)$ over all $(a, b)$. In the following computations, $O(1)$ refers to a term
of arbitrary sign whose absolute value can be bounded by a constant independent
of $\rho$.

Let $\ell_x^*$ and $\ell_y^*$ be the line segments on $\ell^*$ from $(a, b)$ to the $x$- and $y$-axes,
respectively. The first observation is that $\ell^*$ can be chosen so that the lengths of
$\ell_x^*$ and $\ell_y^*$ differ by $O(1)$. (In fact, these lengths can be made arbitrarily close,
although not necessarily equal.) In particular, the area of $T(\ell^*)$ is $2ab + O(1)$. We
will assume, without loss of generality, that the length of $\ell_y^*$ is not larger than the
length of $\ell_x^*$.

Now find an integer point $(0, y_0)$ on the $y$-axis immediately below where $\ell^*$
intersects the $y$-axis. Reflect $(0, y_0)$ through $(a, b)$ to get a point $(x_1, y_1)$ within
$O(1)$ of the intersection of $\ell^*$ with the $x$-axis. [Note that $(x_1, y_1)$ lies within the
closed first quadrant.] Also, let $(x_0, 0)$ be an integer point on the $x$-axis
immediately to the left of where $\ell^*$ intersects it. Form the closed polygon $\Pi \subset \mathbb{R}^2$ by
connecting $(a, b) \to (0, y_0) \to (0, 0) \to (x_0, 0) \to (x_1, y_1) \to (a, b)$.

The area of $\Pi$ is $2ab + O(1)$. Most importantly, the number of integer sites in
$\Pi$ is $A(\ell^*) + n^*/2 + O(1)$, where $n^*$ is the number of integer points in $\partial\Pi$ which are




not on the axes. This follows since one loses about half of these by a small rotation
of the line from $(0, y_0)$ to $(x_1, y_1)$ around $(a, b)$. Therefore,

$$A(\ell^*) = |\{\text{integer points in the interior of } \Pi\}| + \frac{n^*}{2} + 2a + 2b + O(1),$$

and by Pick's theorem [1],

$$\text{area of } \Pi = |\{\text{integer points in the interior of } \Pi\}| + \frac{n^*}{2} + a + b + O(1) = A(\ell^*) - a - b + O(1).$$

It follows that

$$A(\ell^*) = 2ab + a + b + O(1) = \tfrac{1}{2}(2a + 1)(2b + 1) + O(1).$$
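As a sanity check on the counting above (not part of the proof), the following sketch compares a brute-force lattice count with Pick's theorem for the symmetric triangle with vertices $(0,0)$, $(2a,0)$, $(0,2b)$, whose hypotenuse passes through $(a,b)$. The function names are illustrative only. Removing the $\gcd(a,b)$ hypotenuse points on one side of $(a,b)$, as a small rotation about $(a,b)$ would, leaves $2ab + a + b + 1$ points, in line with the $2ab + a + b + O(1)$ estimate.

```python
from math import gcd


def lattice_points_in_triangle(a: int, b: int) -> int:
    """Brute-force count of integer (x, y) with x, y >= 0 and b*x + a*y <= 2ab."""
    count = 0
    for y in range(2 * b + 1):
        # For fixed y, x ranges over 0..floor((2ab - a*y) / b).
        count += (2 * a * b - a * y) // b + 1
    return count


def pick_count(a: int, b: int) -> int:
    """Same count via Pick's theorem: area = interior + boundary/2 - 1."""
    area = 2 * a * b
    boundary = 2 * a + 2 * b + gcd(2 * a, 2 * b)  # always even
    interior = area - boundary // 2 + 1
    return interior + boundary


if __name__ == "__main__":
    a, b = 3, 2
    total = lattice_points_in_triangle(a, b)
    print(total, pick_count(a, b), total - gcd(a, b), 2 * a * b + a + b + 1)
```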


Let $M_N$ be the number of different products $mn$ of integers $m, n \in [1, N]$. It
follows that
2 





log" p 

by the Hall-Tenenbaum sharpening of a theorem of Erdős ([17], Theorem 23).
This ends the proof of the upper bound. The lower bound follows because the
lower bound in [17], Theorem 23 is obtained using odd integers only. □
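To make the quantity $M_N$ concrete, here is a minimal brute-force sketch that counts distinct products in the $N \times N$ multiplication table. It is only an illustration for small $N$ (the helper name is hypothetical) and says nothing about the asymptotic rate used in the proof.

```python
def distinct_products(n: int) -> int:
    """Number of distinct values m*k with 1 <= m, k <= n."""
    # By symmetry it is enough to take m <= k.
    return len({m * k for m in range(1, n + 1) for k in range(m, n + 1)})


if __name__ == "__main__":
    for n in (5, 10, 100, 1000):
        m_n = distinct_products(n)
        # The ratio M_N / N^2 decreases slowly as N grows.
        print(n, m_n, m_n / n**2)
```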


8. Final remarks. In this section we mention several results which are related
to the main topics of the paper, sketch their proofs, and also complete the proof of
Theorem 1 by removing $\log^2 t$ from the large deviation estimate.


REMARK 1 (Continuous time). A standard continuous-time growth model A; 
is obtained by adjoining every site x at an independent rate 1 exponential time after 
the time Ty at which x € J (A,,). This process can be constructed in the standard 
way by attaching a Poisson process £, to every x. Theorem 1 is still valid in this 
case. The a priori large deviation bound however has log t replaced by log t. We 
now sketch the proof. 


Observe $A_t$ in discrete time units $t = 1, 2, \ldots$. Change it to $A'_t$ by making sure
that between each time $t$ and $t + 1$ no site at distance more than $C \log t$ from $A'_t$
gets occupied. As is easy to see by comparison with the continuous time additive
dynamics having the same neighborhood,


$$P(A_t \ne A'_t \text{ within a lattice ball of radius } t^2) \le t^{-3},$$


when $C$ is large enough. Now continue the proof with $A'_t$, which of course is a
discrete-time monotone Markov process. Lemma 2.2 must be used $C \log t$
successive times to obtain the analog of estimate (2.2), and here is where the larger
power of log originates. From this point the proof proceeds on familiar grounds,
yielding existence of the asymptotic speeds $w$, while the continuous-time version
of Corollary 1.1 establishes existence of the shape $L$.




REMARK 2 (Approximating half-space velocities). Perhaps the most convenient
method for simulating a growth CA started from a half-space is to use a strip
with tilted periodic boundary. Take a vector $u$ at angle $\varphi \in [0, \pi/4]$ to $e_2$. (By
rotations and reflections it is clearly enough to consider these.) Then take a large $M$,
and restrict the growth to the strip $H_M = [0, M - 1] \times \mathbb{Z}$. Let $\kappa = \tan\varphi$. Given any
configuration of occupied sites inside $H_M$, extend it to $[-M, 2M - 1] \times \mathbb{Z}$,
by identifying the state of $(x, y)$ with that of $(x - M, \lfloor y - \kappa M \rfloor)$ if $x \ge M$ and
with that of $(x + M, \lfloor y + \kappa M \rfloor)$ if $x < 0$.

Start from $A_0 = \{(x, y) \in H_M : y \le \kappa x\}$. Let the dynamics update sites in $H_M$
with the specified boundary conditions until some large time $t$ when the interface
apparently equilibrates. At this point, the average height above all points in
$[0, M - 1]$, multiplied by $\cos\varphi / t$, is a good approximation to $w_p(u)$.


Note that this is a much more efficient technique for computing the shape $L_p$
than merely running the dynamics from a finite seed and observing the resulting
blob. In particular, smoothness of $L_p$ is impossible to discern that way. The method
outlined above, on the other hand, uses averaging to greatly reduce transversal
fluctuations on the interface. The theoretical underpinning is partly given in our
last theorem.

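For a fixed $M$, and any $x \in [0, M - 1]$, let

$$h^1_t(x) = \max\{y : (x, \lfloor y - \kappa x \rfloor) \in A_t\}, \qquad h^1_t = \max\{h^1_t(x) : x \in [0, M - 1]\},$$
$$h^0_t(x) = \min\{y : (x, \lfloor y - \kappa x \rfloor) \notin A_t\}, \qquad h^0_t = \min\{h^0_t(x) : x \in [0, M - 1]\}.$$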


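THEOREM 4. Fix an $\varepsilon > 0$. If $M$ is large enough, then with probability 1

$$\frac{w_p(u) - \varepsilon}{\cos\varphi} \le \liminf_{t \to \infty} \frac{h^0_t}{t} \le \limsup_{t \to \infty} \frac{h^1_t}{t} \le \frac{w_p(u) + \varepsilon}{\cos\varphi}$$

uniformly in $u$.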


In fact, it is easy to show by subadditivity that $h^0_t/t$ and $h^1_t/t$ both converge a.s.
to the same number as $t \to \infty$.


PROOF OF THEOREM 4. We start by proving the lower bound. Note that
the boundary effects spread with finite speed, as $\mathcal{N}$ is finite. Thus, until the
time $t_0 = cM$, the occupied sites on any vertical line through $x \in [0, M - 1]$
are above those started from an infinite tilted half-plane through $(x, \lfloor \kappa x \rfloor)$ or
through $(x, \lfloor \kappa x \rfloor - 1)$. (We have to allow for the second possibility because there
may not be a perfect match at the boundaries.) By the weaker form of Theorem 1,
the probability that the lowest unoccupied site above a fixed $x$ is below
$\kappa x + (w_p(u)(1 - \varepsilon)/\cos\varphi)\, t_0$ is at most $\exp(-c t_0/\log^2 t_0)$. For $n \ge 0$, define this
translation of $A_0$:

$$B_n = \Big\{(x, y) : y \le \kappa x + \frac{w_p(u) - 2\varepsilon}{\cos\varphi}\, n t_0\Big\}$$




and the event

$$E = \{B_1 \subset A_{t_0}\}.$$

It follows that

$$P(E^c) \le M \exp(-c t_0/\log^2 t_0) < \varepsilon,$$

if $M$ is large enough.


Now run the dynamics until time $t_0$. If $E$ happens, restart the dynamics from
the set $B_1$; otherwise restart the dynamics from $B_0 = A_0$. Then repeat from the
possibly translated $A_0$. Let $U_n$ be the largest $k$ for which $B_k \subset A_{n t_0}$. We have just
proved that $U_n$ dominates an $n$-step random walk which at each step increases by 1
with probability $1 - \varepsilon$ and stays put with probability $\varepsilon$. Therefore $U_n \ge (1 - 2\varepsilon)n$
with probability at least $1 - \exp(-cn)$. This ends the proof of the lower bound (as
uniformity in $u$ follows because the constant $c$ in the weaker form of Theorem 1
does not depend on $u$).

The upper bound is proved similarly, except for the fact that we need an upper
bound on the extent to which $A_t$ can propagate in $t_0$ steps. The trivial bound $C t_0$,
where $C$ is the diameter of $\mathcal{N}$, suffices. The comparison random walk now
increases by 1 with probability at least $1 - \varepsilon$ and by $C$ with probability $\varepsilon$, and so its
speed is at most $1 + C\varepsilon$. □
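The comparison walk used for the lower bound can be simulated directly. The sketch below (the function name and parameter values are illustrative assumptions, not taken from the paper) runs a walk that steps forward with probability $1 - \varepsilon$ and stays put otherwise, and compares its position after $n$ steps with the threshold $(1 - 2\varepsilon)n$ from the proof.

```python
import random


def comparison_walk(n_steps: int, eps: float, rng: random.Random) -> int:
    """Position after n_steps of a walk that moves +1 w.p. 1 - eps, else stays."""
    position = 0
    for _ in range(n_steps):
        if rng.random() >= eps:  # event of probability 1 - eps
            position += 1
    return position


if __name__ == "__main__":
    rng = random.Random(0)
    n, eps = 10_000, 0.05
    u_n = comparison_walk(n, eps, rng)
    # The mean is (1 - eps) * n, so falling below (1 - 2*eps) * n is a
    # large-deviation event with exponentially small probability in n.
    print(u_n, (1 - 2 * eps) * n)
```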


Theorem 4 remains valid for continuous-time growth as well, with a similar 
proof, except for two significant differences. The first is that the coupling of finite 
and infinite systems only holds up to time cL with probability exponentially close 
to 1 in L [15]. The second is that the trivial upper bound at the end of the proof is not
available, so the forward jump of the comparison random walk is arbitrarily long, 
with exponential tail probabilities. It is clear that the proof is still valid under these 
conditions. 


PROOF OF THEOREM 1 (CONCLUDED). We choose $u$, $M$ and $t_0$ as in the
proof of Theorem 4. Recall that $t_0/M$ is small and so during a time interval of
length $t_0$ sites at a distance larger than $M$ cannot interact. Define


Bni = (e 9M <x GE DM - 1 and y xx T0 n] 


and the event 

E= [Bi 0 C An}. 
As before, P (E€) < & if M is large enough. Now we define ñn (i) to be the maxi- 
mal k such that every site (x, y) with x € [/ M, (i-- 1)M — I] and y < "x€)- kto + 
X is in Ann. We can couple 7j, and the slow version of 7), from Lemma 3.3 (with 


p’ —1-— e), so that 7, (i) > 9, (1) for every i and n. (The Bernoulli random vari- 
ables b are probabilities of suitable translates of the event E.) It follows that, for 




every $\delta > 0$ we can find a small enough $\varepsilon$ (which then dictates a large enough $M$),
so that


Wy (u) — E 
< (1 —8)—————— 
P(O, y) € A forall y « ) ve" 


This proves the exponential bound on the probability that $A_t/t$ lags significantly
behind $w_p(u)$. We will now show that it cannot progress significantly faster
than $w_p(u)$ either. To this end, we redefine


e) Slee 


A Wa (u) +E 
Bni = S EE E E D ER ag 


and 
E= [An E Bi,0]). 


so that again P(E°) < £e. In case E fails, the occupied sites above the interval 
[0, M — 1] cannot progress by more than the diameter of M. Furthermore, now 
jn (1) is the maximal k such that some site (x, y) with x € [1 M, (i + 1)M — 1] and 
y Rk + Mx OS nto c kx is in Any. Here R is a suitably large multiple of the 
diameter of M, which ensures that the fast version of 7), in Lemma 3.3 (now with 
p’ = €) dominates ña. Thus Lemma 3.3(3) completes the proof. CI 


REMARK 3 (Continuity of $K_{1/w_p}$). Assume a standard p-perturbation of $\mathcal{T}$.
As $p$ changes from 0 to 1, $K_{1/w_p}$ varies continuously. To see this, assume that $p'$ is
close to $p$, $p' < p$ and couple the systems with the two probabilities in the obvious
way. To show that $w_{p'}(u)$ is close to $w_p(u)$ (uniformly in $u$), one needs to look at
the proof of the lower bound in Theorem 4. Between 0 and fp, the occupied sets 
in the two systems will not differ at all with probability (1 — (p — p’))©, which, 
as tọ is constant (albeit dependent on £), can be made larger than 1 — € if p — p’ 
is small enough. Once this observation is made, it is only necessary to follow the 
rest of the lower bound proof with p replaced by p'. 

Note that this continuity alone demonstrates that $L_p$ has corners (i.e., is not
differentiable) in every case not quasi-additive when $p$ is close enough to 1 (although
these corners may not move at the same speed, or even in the same direction, as
the corresponding corners of $L_1$).


REMARK 4 (Shapes for small p). Again, assume a standard p-perturbation 
of 7. What happens as p — 0? Certainly Lp shrinks, and in fact 


1 - 
—Lp —> L, 
P 


the limit shape of the continuous-time growth model A;. To see this, let A/ 
be Ajr/p|- After a site sees a sufficient configuration in A; (resp. A+), it becomes 




occupied in a time distributed as 7, (resp. Te). It is easy to see that for small p, 
distributionally, 7, > Te(1 — p). Rescaling of continuous time immediately gives 
L/(—p)2 Lp/ p for small p. 

For the opposite direction, observe that Te + p > Tg, in distribution. The lower 
bound part of the proof of Theorem 4 for continuous time now shows that Ån D Bi 
with probability 1 — e. With the same probability, then, A; D By, at time t = tọ + 
CpLto < (1+ &)fo if p is small enough. (The added term is simply p times the 
number of sites in Bı \ Ao.) This easily completes the proof. 


In closing, let us pose two challenging conjectures based on experiment. 


CONJECTURE 8.1. If $\mathcal{T}_p$ is a standard p-perturbation of a locally regular and
supercritical CA with convex $K_{1/w}$, then $K_{1/w_p}$ is strictly convex.


This conjecture fails for nonstandard perturbations, as seen from the example
discussed in Section 6. In fact, it fails for nonstandard perturbations even if we
restrict to $p$ very close to 1. A range 2 box counterexample is obtained by $\pi(S)$
which is 1 when $|S| \ge 9$ and $p$ when $|S| = 8$. A glance at Figure 6, together with
results from Section 3, confirms that $K_{1/w_p}$ cannot be convex for any $p < 1$.

Figure 7 illustrates the application of Theorem 4 to two examples. The left
frame depicts $K_{1/w_p}$ for the Moore TG CA with $\theta = 3$ and $p = 1, 0.9, \ldots, 0.4$,
while the right frame does the same for the range 2 box TG CA with $\theta = 9$ and
p = 1,0.975, ..., 0.5. As guaranteed by Theorem 2, co(Ki/w,) C co(K1/w,) for p 
close enough to 1. What is more, the angles at the corners of Ki/w, approach 
the angles of K1;,; as p — 1. (This can in fact be proved by methods of the 


FIG. 7. Two examples of TG $K_{1/w_p}$'s.




present paper.) As $p$ decreases, $\partial K_{1/w_p}$ separates from $\partial K_{1/w_1}$, but $K_{1/w_p}$
remains nonconvex. (Whether it may become $C^1$ before the boundaries separate is
not clear.) Upon further decrease in $p$ one observes concavities gradually filling
in, until $K_{1/w_p}$ becomes convex. Such observations, as well as early belief in the
asymptotic isotropy of Eden's continuous-time random growth model [7], suggest
our final conjecture.


CONJECTURE 8.2. If p is small enough, the standard p-perturbation of T 
has strictly convex and smooth Kj; ,. The continuous-time version Kj is also 
strictly convex and smooth. 


Acknowledgments. We thank Gérald Tenenbaum for kindly sharing his
expertise on intricacies of the Linnik-Vinogradov-Erdős problem with us. Thanks
also go to Timo Seppäläinen for pointing out an error in an early version.


REFERENCES 


[1] AIGNER, M. and ZIEGLER, G. M. (2001). Proofs from the Book, 2nd ed. Springer, New York.
MR1801937
[2] ALEXANDER, K. S. (1997). Approximation of subadditive functions and convergence rates in 
limiting-shape results. Ann. Probab. 25 30—55. MR1428498 
[3] BOHMAN, T. and GRAVNER, J. (1999). Random threshold growth dynamics. Random Struc- 
tures Algorithms 15 93-111. MR1698409 
[4] BRAMSON, M. and GRAY, L. (1991). A useful renormalization argument. In Random Walks,
Brownian Motion, and Interacting Particle Systems. Festschrift in Honor of Frank Spitzer
(R. Durrett and H. Kesten, eds.) 113-152. Birkhäuser, Boston. MR1146444
[5] DURRETT, R. (1988). Lecture Notes on Particle Systems and Percolation. Brooks-Cole,
Belmont, MA. MR940469
[6] DURRETT, R. and LIGGETT, T. M. (1981). The shape of the limit set in Richardson's growth
model. Ann. Probab. 9 186-193. MR606981
[7] EDEN, M. (1961). A two-dimensional growth process. Proc. Fourth Berkeley Symp. Math. Sta- 
tist. Probab. 4 223—239. Univ. California Press, Berkeley. MR136460 
[8] GRAVNER, J. (1999). Recurrent ring dynamics in two-dimensional excitable cellular automata. 
J. Appl. Probab. 36 492—511. MR1724804 
[9] GRAVNER, J. and GRIFFEATH, D. (1993). Threshold growth dynamics. Trans. Amer. Math. 
Soc. 340 837—870. MR1147400 
[10] GRAVNER, J. and GRIFFEATH, D. (1996). First passage times for discrete threshold growth 
dynamics. Ann. Probab. 24 1752-1778. MR1415228 
[11] GRAVNER, J. and GRIFFEATH, D. (1997). Multitype threshold voter model and convergence
to Poisson-Voronoi tessellation. Ann. Appl. Probab. 7 615-647. MR1459263
[12] GRAVNER, J. and GRIFFEATH, D. (1998). Cellular automaton growth on Z^2: Theorems,
examples and problems. Adv. in Appl. Math. 21 241-304. MR1634709
[13] GRAVNER, J. and GRIFFEATH, D. (1999). Reverse shapes in first-passage percolation and
related growth models. In Perplexing Problems in Probability. Festschrift in Honor of Harry
Kesten (M. Bramson and R. Durrett, eds.) 121-142. Birkhäuser, Boston. MR1703128
[14] GRAVNER, J., TRACY, C. and WIDOM, H. (2001). Limit theorems for height fluctuations 
in a class of discrete space and time growth models. J. Statist. Phys. 102 1085-1132. 
MR1830441 
[15] GRIFFEATH, D. (1981). The basic contact process. Stochastic Process. Appl. 11 151—186. 
MR616064 




[16] GRIFFEATH, D. Primordial soup kitchen. Available at http://psoup.math.wisc.edu. 

[17] HALL, R. R. and TENENBAUM, G. (1988). Divisors. Cambridge Univ. Press. 

[18] HAMMERSLEY, J. M. and WELSH, D. J. (1965). First-passage percolation, subadditive 
processes, stochastic networks, and generalized renewal theory. In Bernoulli, Bayes, 
Laplace Anniversary Volume (J. Neyman and L. LeCam, eds.) 61—110. Springer, 
New York. MR198576 

[19] JANSON, S., LUCZAK, T. and RUCINSKI, A. (2000). Random Graphs. Wiley, New York. 
MR1782847 

[20] JOHANSSON, K. (2000). Shape fluctuations and random matrices. Comm. Math. Phys. 209 
437—476. MR1737991 

[21] KESTEN, H. (1993). On the speed of convergence in first-passage percolation. Ann. Appl. 
Probab. 3 296-338. MR1221154 

[22] KESTEN, H. and SCHONMANN, R. H. (1995). On some growth models with a small parameter. 
Probab. Theory Related Fields 101 435—468. MR1327220 

[23] KORSHUNOV, A. D. and SHMULEVICH, I. (2002). On the distribution of the number of 
monotone Boolean functions relative to the number of lower units. Discrete Math. 257 
463—479. MR1935742 

[24] LIGGETT, T. M., SCHONMANN, R. H. and STACEY, A. M. (1997). Domination by product
measures. Ann. Probab. 25 71-95. MR1428500

[25] MARCHAND, R. (2002). Strict inequalities for the time constant in first passage percolation. 
Ann. Appl. Probab. 12 1001-1038. MR1925450 

[26] MEAKIN, P. (1998). Fractals, Scaling and Growth far from Equilibrium. Cambridge Univ. 
Press. MR1489739 

[27] NEWMAN, C. M. and PIZA, M. S. T. (1995). Divergence of shape fluctuations in two dimen- 
sions. Ann. Probab. 23 977-1005. MR1349159 

[28] PIMPINELLI, A. and VILLAIN, J. (1999). Physics of Crystal Growth. Cambridge Univ. Press. 

[29] RICHARDSON, D. (1973). Random growth in a tessellation. Proc. Cambridge Philos. Soc. 74 
515—528. MR329079 

[30] SEPPÄLÄINEN, T. (1998). Exact limiting shape for a simplified model of first-passage
percolation on the plane. Ann. Probab. 26 1232-1250. MR1640344

[31] SEPPÄLÄINEN, T. (1999). Existence of hydrodynamics for the totally asymmetric simple
K-exclusion process. Ann. Probab. 27 361-415. MR1681094

[32] SETHNA, J. Equilibrium crystal shapes. Available at http://www.lassp.cornell.edu/sethna/ 
CrystalShapes/. 

[33] STEELE, J. M. (1997). Probability Theory and Combinatorial Optimization. SIAM,
Philadelphia. MR1422018

[34] STEELE, J. M. and ZHANG, Y. (2003). Nondifferentiability of the time constants of first- 
passage percolation. Ann. Probab. 31 1028-1051. MR1964957 

[35] TOOM, A. L. (1974). Nonergodic multidimensional systems of automata. Probl. Inf. Transm. 
10 239-246. MR469584 

[36] TOOM, A. L. (1980). Stable and attractive trajectories in multicomponent systems. In Advances 
in Probability and Related Topics (R. L. Dobrushin and Ya. G. Sinai, eds.) 6 549—575. 
Dekker, New York. MR599548 

[37] WILLSON, S. J. (1978). On convergence of configurations. Discrete Math. 23 279-300.
MR523078

MATHEMATICS DEPARTMENT            DEPARTMENT OF MATHEMATICS
UNIVERSITY OF CALIFORNIA          UNIVERSITY OF WISCONSIN
DAVIS, CALIFORNIA 95616           MADISON, WISCONSIN 53706
USA                               USA
E-MAIL: gravner@math.ucdavis.edu  E-MAIL: griffeat@math.wisc.edu


The Annals of Probability 

2006, Vol. 34, No. 1, 219-263 

DOI: 10.1214/009117905000000387 

© Institute of Mathematical Statistics, 2006


LATE POINTS FOR RANDOM WALKS 
IN TWO DIMENSIONS 


By AMIR DEMBO,¹ YUVAL PERES,² JAY ROSEN³
AND OFER ZEITOUNI⁴


Stanford University, University of California-Berkeley, City University of New 
York—College of Staten Island and Technion and University of Minnesota 


Let $\mathcal{T}_n(x)$ denote the time of first visit of a point $x$ on the lattice
torus $\mathbb{Z}_n^2 = \mathbb{Z}^2/n\mathbb{Z}^2$ by the simple random walk. The size of the set of
$\alpha, n$-late points $\mathcal{L}_n(\alpha) = \{x \in \mathbb{Z}_n^2 : \mathcal{T}_n(x) \ge \alpha \frac{4}{\pi}(n \log n)^2\}$ is approximately
$n^{2(1-\alpha)}$, for $\alpha \in (0, 1)$ [$\mathcal{L}_n(\alpha)$ is empty if $\alpha > 1$ and $n$ is large enough].
These sets have interesting clustering and fractal properties: we show that for
$\beta \in (0, 1)$, a disc of radius $n^\beta$ centered at nonrandom $x$ typically contains
about $n^{2\beta(1 - \alpha/\beta^2)}$ points from $\mathcal{L}_n(\alpha)$ (and is empty if $\beta < \sqrt{\alpha}$), whereas
choosing the center $x$ of the disc uniformly in $\mathcal{L}_n(\alpha)$ boosts the typical number
of $\alpha, n$-late points in it to $n^{2\beta(1-\alpha)}$. We also estimate the typical number
of pairs of $\alpha, n$-late points within distance $n^\beta$ of each other; this typical number
can be significantly smaller than the expected number of such pairs, calculated
by Brummelhuis and Hilhorst [Phys. A 176 (1991) 387-408]. On the
other hand, our results show that the number of ordered pairs of late points
within distance $n^\beta$ of each other is larger than what one might predict by
multiplying the total number of late points, by the number of late points in a
disc of radius $n^\beta$ centered at a typical late point.


1. Introduction. Consider a simple random walk (SRW) on an n x n square 
with periodic boundary conditions (also called a lattice torus), run until the "cover 
time," when it has visited every point of the square. Our focus will be on the set of 
uncovered points shortly before coverage, which we call "late points." In an impor- 
tant paper, Brummelhuis and Hilhorst [1] pointed out that in two dimensions, this 
set has an interesting fractal structure. The main finding of the present paper is that 
the set of late points has an even more subtle fractal structure than that suggested 
in [1]. A significant reason for this is that a key random variable measuring the 
structure of late points, namely the number of pairs of late points within distance 
$n^\beta$ of each other, has a median and mean of different orders of magnitude.


Received April 2003; revised July 2004.
¹Supported in part by NSF Grant DMS-00-72331 and by a US-Israel BSF grant.
²Supported in part by NSF Grant DMS-98-03597.
³Supported in part by grants from the NSF, from PSC-CUNY and from a US-Israel BSF grant.
⁴Supported in part by a US-Israel BSF grant.
AMS 2000 subject classifications. Primary 60G50, 82C41; secondary 28A80.
Key words and phrases. Planar random walk, cover time, late points, multifractal analysis.




As noted in [1] this fractal structure is not present in three or higher dimensions, 
where at the scale of power laws the set of uncovered points resembles a uniformly 
sampled random set of the same size. 

We proceed to a more quantitative discussion. Consider the SRW on the lattice
torus $\mathbb{Z}_n^2 = \mathbb{Z}^2/n\mathbb{Z}^2$ starting at the origin. If $x \in \mathbb{Z}_n^2$ we let $\mathcal{T}_n(x)$ denote the time
it takes the walk to first visit $x$. Let $\mathcal{T}_n = \max_{x \in \mathbb{Z}_n^2} \mathcal{T}_n(x)$ denote the time it takes
the walk to completely cover $\mathbb{Z}_n^2$. In [4], Theorem 1.1, we showed that

$$\lim_{n \to \infty} \frac{\mathcal{T}_n}{(n \log n)^2} = \frac{4}{\pi} \quad \text{in probability.} \tag{1.1}$$

(Contrast this with the typical hitting time of a fixed point $x \in \mathbb{Z}_n^2$, which is of order
$n^2 \log n$.)


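We say that $x \in \mathbb{Z}_n^2$ is $\alpha, n$-late for some $0 < \alpha < 1$ if

$$\mathcal{T}_n(x) \ge \alpha \frac{4}{\pi} (n \log n)^2,$$

and set $\mathcal{L}_n(\alpha)$ to be the set of $\alpha, n$-late points in $\mathbb{Z}_n^2$. An adaptation of the arguments
in [4] reveals that $|\mathcal{L}_n(\alpha)| \approx n^{2 - 2\alpha}$ in the following sense.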


PROPOSITION 1.1. For any $0 < \alpha < 1$,

$$\lim_{n \to \infty} \frac{\log |\mathcal{L}_n(\alpha)|}{\log n} = 2(1 - \alpha) \quad \text{in probability.} \tag{1.2}$$
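The definitions above translate directly into a small simulation. The sketch below is an illustration only (the function names are hypothetical): it runs a simple random walk on the $n \times n$ torus from the origin until every site is visited, records first-visit times, and extracts the $\alpha, n$-late set using the threshold $\alpha \frac{4}{\pi}(n \log n)^2$. For the small $n$ that are practical here, the asymptotic exponent $2(1-\alpha)$ should not be expected to be visible.

```python
import math
import random


def first_visit_times(n: int, rng: random.Random) -> dict:
    """First-visit time of every site of the n x n torus, walk started at (0, 0)."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    x = y = t = 0
    times = {(0, 0): 0}
    while len(times) < n * n:  # run until the torus is covered
        dx, dy = rng.choice(steps)
        x, y = (x + dx) % n, (y + dy) % n
        t += 1
        times.setdefault((x, y), t)  # keep only the first visit
    return times


def late_points(times: dict, n: int, alpha: float) -> set:
    """Sites whose first-visit time is at least alpha * (4/pi) * (n log n)^2."""
    threshold = alpha * (4 / math.pi) * (n * math.log(n)) ** 2
    return {site for site, t in times.items() if t >= threshold}


if __name__ == "__main__":
    n = 20
    times = first_visit_times(n, random.Random(1))
    print("cover time:", max(times.values()))
    for alpha in (0.1, 0.3, 0.5, 0.7):
        print(alpha, len(late_points(times, n, alpha)))
```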


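If $\mathcal{L}_n(\alpha)$ were spread out uniformly in $\mathbb{Z}_n^2$, one would expect that for any $x \in \mathbb{Z}_n^2$
and $\alpha < \beta < 1$ we would have $|\mathcal{L}_n(\alpha) \cap D(x, n^\beta)| \approx n^{2\beta - 2\alpha}$. The next two theorems
make precise the idea that the set $\mathcal{L}_n(\alpha)$ does not look like an independent
uniform drawing of $n^{2 - 2\alpha}$ points in $\mathbb{Z}_n^2$, in the sense that $|\mathcal{L}_n(\alpha) \cap D(x, n^\beta)| \approx
n^{2\beta - 2\alpha/\beta}$ for a typical $x$, whereas it is $\approx n^{2\beta(1 - \alpha)}$ for most $x \in \mathcal{L}_n(\alpha)$.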


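THEOREM 1.2. For any $0 < \alpha < \beta^2 < 1$ and $\delta > 0$,

$$\lim_{n \to \infty} \max_{x \in \mathbb{Z}_n^2} P\left( \left| \frac{\log |\mathcal{L}_n(\alpha) \cap D(x, n^\beta)|}{\log n} - (2\beta - 2\alpha/\beta) \right| > \delta \right) = 0. \tag{1.3}$$

In particular, for any $0 < \alpha, \beta < 1$ and any nonrandom sequence $x_n \in \mathbb{Z}_n^2$,

$$\lim_{n \to \infty} \frac{\log |\mathcal{L}_n(\alpha) \cap D(x_n, n^\beta)|}{\log n} = \max(2\beta - 2\alpha/\beta, 0) \quad \text{in probability.} \tag{1.4}$$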

As stated already, the fractal nature of $\mathcal{L}_n(\alpha)$ is described by the next theorem
that shows the clustering of late points; in the neighborhood of a "typical" $\alpha, n$-late
point there is an "unusually large" number of $\alpha, n$-late points.




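THEOREM 1.3. For any $0 < \alpha, \beta < 1$ and $\delta > 0$,

$$\lim_{n \to \infty} \max_{x \in \mathbb{Z}_n^2 \setminus \{0\}} P\left( \left| \frac{\log |\mathcal{L}_n(\alpha) \cap D(x, n^\beta)|}{\log n} - 2\beta(1 - \alpha) \right| > \delta \,\Big|\, x \in \mathcal{L}_n(\alpha) \right) = 0. \tag{1.5}$$

Further, choosing $Y_n$ uniformly in $\mathcal{L}_n(\alpha)$,

$$\lim_{n \to \infty} \frac{\log |\mathcal{L}_n(\alpha) \cap D(Y_n, n^\beta)|}{\log n} = 2\beta(1 - \alpha) \quad \text{in probability.} \tag{1.6}$$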


The predictions of [1], which motivated our work, are related to another
description of the clustering properties of $\mathcal{L}_n(\alpha)$, obtained by focusing on pairs of
late points.


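THEOREM 1.4. Let $0 < \alpha, \beta < 1$. Then

$$\lim_{n \to \infty} \frac{\log |\{(x, y) \in \mathcal{L}_n^2(\alpha) : d(x, y) \le n^\beta\}|}{\log n} = \rho(\alpha, \beta) \quad \text{in probability,} \tag{1.7}$$

where

$$\rho(\alpha, \beta) = \begin{cases} 2 + 2\beta - 4\alpha/(2 - \beta), & \text{if } \beta \le 2(1 - \sqrt{\alpha}), \\ 8(1 - \sqrt{\alpha}) - 4(1 - \sqrt{\alpha})^2/\beta, & \text{if } \beta \ge 2(1 - \sqrt{\alpha}). \end{cases} \tag{1.8}$$

For the mean number of pairs of $\alpha, n$-late points within distance $n^\beta$ of each
other, Brummelhuis and Hilhorst ([1], (3.36)) obtain different growth exponents

$$\hat{\rho}(\alpha, \beta) = \begin{cases} 2 + 2\beta - 4\alpha/(2 - \beta), & \text{if } \beta \le 2 - \sqrt{2\alpha}, \\ 6 - 4\sqrt{2\alpha}, & \text{if } \beta \ge 2 - \sqrt{2\alpha}. \end{cases} \tag{1.9}$$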
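As we explain below, the functions

$$F_{h,\beta}(\gamma) = \frac{(1 - \gamma\beta)^2}{1 - \beta} + h\gamma^2\beta \tag{1.10}$$

of $\gamma \ge 0$, with $h$ a nonnegative integer, play an important role in the study of late
points. It can be easily checked that

$$\rho(\alpha, \beta) = 2 + 2\beta - 2\alpha \inf_{\gamma \in \Gamma_{\alpha,\beta}} F_{2,\beta}(\gamma), \tag{1.11}$$

where

$$\Gamma_{\alpha,\beta} = \{\gamma \ge 0 : 2 - 2\beta - 2\alpha F_{0,\beta}(\gamma) \ge 0\} \tag{1.12}$$

(see Section 9). It is also easy to verify that

$$\hat{\rho}(\alpha, \beta) = \sup_{\beta' \le \beta} \, \sup_{\gamma \ge 0} \{2 + 2\beta' - 2\alpha F_{2,\beta'}(\gamma)\}, \tag{1.13}$$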




so the difference between $\hat{\rho}(\alpha, \beta)$ and $\rho(\alpha, \beta)$ is that the supremum in (1.13) is
not subject to the constraint that $\gamma \in \Gamma_{\alpha,\beta}$. As explained below, this constraint
differentiates the median number of pairs of $\alpha, n$-late points within distance $n^\beta$ of
each other, easily obtained from (1.7), from its mean (found already in [1]).

The key to our approach lies in the following heuristic picture relating the lateness
property to certain excursion counts for the random walk: fix an appropriate
sequence of increasing radii $r_k$, $k = 1, \ldots, k_n$, with $r_{k+1}/r_k \sim r_k/r_{k-1}$, $r_0 = 1$
and $r_{k_n} < n$, and count the number of excursions $N_x(k)$ between $D(x, r_{k-1})$
and $D(x, r_k)$. A point that has many fewer than the typical number of excursions
between these levels, by time $4\alpha(n \log n)^2/\pi$, is also extremely likely to
be $\alpha, n$-late (see Lemma 4.1). Further, a typical $x \in \mathcal{L}_n(\alpha)$ has an atypical profile
of excursion counts, determined approximately by considering a one-dimensional
simple random walk on the set $\{1, \ldots, k_n\}$, started at $k_n$, and conditioned not to
hit 1. Thus, not only is the point $x$ not hit by the random walk, but in fact a
neighborhood of it is visited less often than it would have been otherwise, and this
creates a large cluster of $\alpha, n$-late points in a neighborhood of such $x$.

Large deviations estimates for this one-dimensional walk imply that certain
$\alpha, n$-late points $x$ have a much smaller number of excursions $N_x(k_n)$ between
discs in an intermediate scale $k_n$, forcing an accumulation of many $\alpha, n$-late points
in $D(x, r_{k_n})$. In more detail, for $r_{k_n} \approx n^\beta$, the probability of $N_x(k_n)$ being near the
value typically associated with $\alpha\gamma^2, n$-late points is about $n^{-2\alpha F_{0,\beta}(\gamma)}$. Given such
a value of $N_x(k_n)$, the probability that $x$ is an $\alpha, n$-late point is about $n^{-2\alpha\gamma^2\beta}$.
Consequently, the probability of $x$ being $\alpha, n$-late with $N_x(k_n)$ near the value typically
associated with an $\alpha\gamma^2, n$-late point is about $n^{-2\alpha F_{0,\beta}(\gamma)} n^{-2\alpha\gamma^2\beta} = n^{-2\alpha F_{1,\beta}(\gamma)}$
and if we require that also a specific $y$ of distance $\approx n^\beta$ from $x$ is $\alpha, n$-late, the
probability is further reduced to about $n^{-2\alpha F_{1,\beta}(\gamma)} n^{-2\alpha\gamma^2\beta} = n^{-2\alpha F_{2,\beta}(\gamma)}$. The
constraint $\gamma \in \Gamma_{\alpha,\beta}$ in (1.11), which is missing in (1.13), represents the range of values
of $N_x(k_n)$ possibly found when examining all $O(n^{2 - 2\beta})$ centers $x$ of discs of
radius $n^\beta$ that cover the torus $\mathbb{Z}_n^2$. Indeed, due to this constraint, the median of the
number of pairs of $\alpha, n$-late points within distance $n^\beta$ of each other is about $n^{\rho(\alpha,\beta)}$,
whereas the mean of this variable is of the different order of magnitude $n^{\hat{\rho}(\alpha,\beta)}$.

The value of $\rho(\alpha, \beta)$ is obtained by taking $\gamma \in \Gamma_{\alpha,\beta}$ for which the probability
of locating specific pairs of $\alpha, n$-late points is maximal. This value of $\gamma$ coincides
with the unconstrained minimizer of $F_{2,\beta}(\cdot)$ if and only if $\beta \le 2(1 - \sqrt{\alpha})$, thus
explaining the jump of $d^2\rho/d\beta^2$ at $\beta = 2(1 - \sqrt{\alpha})$. It is never the same as the
typical $\gamma = 1$ [i.e., the minimizer of $F_{1,\beta}(\cdot)$], which one finds in most discs of
radius $n^\beta$ centered at $\alpha, n$-late points. Hence, $\gamma = 1$ controls the exponent of
Theorem 1.3. In contrast, the exponent of Theorem 1.2 is controlled by $\gamma = 1/\beta$ [i.e.,
the minimizer of $F_{0,\beta}(\cdot)$], found in most of the $O(n^{2 - 2\beta})$ discs of radius $n^\beta$ that
cover $\mathbb{Z}_n^2$.
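The variational description of the exponents can be checked numerically. The sketch below is an illustration under stated assumptions: it takes $F_{h,\beta}(\gamma) = (1-\gamma\beta)^2/(1-\beta) + h\gamma^2\beta$ as in (1.10), evaluates the constrained infimum in (1.11) on a grid of $\gamma$ values (a crude but simple method, not the paper's), and compares it with the closed forms (1.8) and (1.9). All function names are hypothetical.

```python
from math import sqrt


def F(h: int, beta: float, gamma: float) -> float:
    """F_{h,beta}(gamma) from (1.10)."""
    return (1 - gamma * beta) ** 2 / (1 - beta) + h * gamma**2 * beta


def rho_closed(alpha: float, beta: float) -> float:
    """Median exponent, closed form (1.8)."""
    s = 1 - sqrt(alpha)
    if beta <= 2 * s:
        return 2 + 2 * beta - 4 * alpha / (2 - beta)
    return 8 * s - 4 * s**2 / beta


def rho_hat_closed(alpha: float, beta: float) -> float:
    """Mean exponent, closed form (1.9)."""
    if beta <= 2 - sqrt(2 * alpha):
        return 2 + 2 * beta - 4 * alpha / (2 - beta)
    return 6 - 4 * sqrt(2 * alpha)


def rho_numeric(alpha: float, beta: float, grid: int = 100_000) -> float:
    """Grid evaluation of (1.11): minimize F_2 over gamma in Gamma_{alpha,beta}."""
    # Gamma_{alpha,beta} is contained in [0, gamma_max].
    gamma_max = (1 + (1 - beta) / sqrt(alpha)) / beta
    best = float("inf")
    for i in range(grid + 1):
        gamma = gamma_max * i / grid
        if 2 - 2 * beta - 2 * alpha * F(0, beta, gamma) >= 0:  # constraint (1.12)
            best = min(best, F(2, beta, gamma))
    return 2 + 2 * beta - 2 * alpha * best


if __name__ == "__main__":
    for alpha, beta in [(0.25, 0.5), (0.64, 0.8), (0.5, 0.9)]:
        print(alpha, beta, rho_closed(alpha, beta),
              rho_numeric(alpha, beta), rho_hat_closed(alpha, beta))
```

In the second and third parameter pairs the constraint is active, and the median exponent falls strictly below the mean exponent.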




Organization. After a short section which collects some facts about the SRW,
our paper is divided into three parts. The first part is about "global" properties of
the set of $\alpha, n$-late points. It consists of Sections 3-5, where, adapting the arguments
of [4], Sections 2, 3, 6, 7, to the context of simple random walk, we prove
Proposition 1.1 and lay the groundwork for all other results. The second part deals
with clustering of late points. It starts with the large deviation probability bounds
of the form $n^{-2\alpha F_{h,\beta}(\gamma)}$, given in Section 6, which are key to our upper bounds, and
moves on to the proofs of Theorems 1.2 and 1.3. The third part of the paper deals
with Theorem 1.4 about pairs of $\alpha, n$-late points. Applying the bounds of Section 6
we derive the upper bound in Section 9, where we also solve the variational
problem (1.11), with the complementary lower bound derived in Section 10 by a
refinement of the construction of Section 4. In the final Section 11 we describe
possible extensions of our results. We note that the arguments in this paper are
based on direct analysis of the random walk, rather than a strong approximation
argument with Brownian motion.


2. Random walk preliminaries. Let $S_n$, $n \ge 0$, denote a simple random walk
(SRW) in $\mathbb{Z}^2$ and let $X_n$, $n \ge 0$, denote SRW in $\mathbb{Z}_K^2$. In this section we collect some
facts about $S_n$, $n \ge 0$, and $X_n$, $n \ge 0$. We adopt here and throughout the paper the:


CONVENTION. Throughout, a function Z(x) is said to be O(x) if Z(x)/x is 
bounded, uniformly in all implicit geometry-related quantities (such as K). That is, 
Z (x) = O (x) if there exists a universal constant C (not depending on K) such that 
$|Z(x)| \le Cx$. Thus $x = O(x)$ but $Kx$ is not $O(x)$. A similar convention applies to
the symbol o(x). 


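Let $D(x, r) = \{y \in \mathbb{Z}^2 : |y - x| < r\}$ where $|z|$ denotes the Euclidean norm of $z$.
For any set $A \subset \mathbb{Z}^2$ we let $\partial A = \{y \in \mathbb{Z}^2 : y \in A^c, \text{ and } \inf_{x \in A} |y - x| = 1\}$ and
$\bar{A} = A \cup \partial A$. For any set $B \subset \mathbb{Z}^2$ let $T_B = \inf\{i \ge 0 : S_i \in B\}$ and $T'_B = \inf\{i \ge
1 : S_i \in B\}$. For $x, y \in A$ define the truncated Green function

$$G_A(x, y) = \sum_{i=0}^{\infty} P^x(S_i = y,\ i < T_{\partial A}).$$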
We have the following result which is Proposition 1.6.7 of [7]. For any $x \in D(0, n)$

$$P^x(T_0 < T_{\partial D(0,n)}) = \frac{\log(n/|x|) + O(|x|^{-1} + (\log n)^{-1})}{\log n} \tag{2.1}$$

and

$$G_{D(0,n)}(x, 0) = \frac{2}{\pi} \log\left(\frac{n}{|x|}\right) + O(|x|^{-1} + n^{-1}). \tag{2.2}$$

We next note formula (1.21) of [7]: Uniformly for $x \in D(0, n)$,

$$n^2 - |x|^2 \le E^x(T_{\partial D(0,n)}) \le (n + 1)^2 - |x|^2. \tag{2.3}$$
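The two-sided bound (2.3) can be verified numerically for a small disc. The sketch below is a check under stated assumptions, not the argument of [7]: it solves the equation $u(x) = 1 + \frac{1}{4}\sum_{y \sim x} u(y)$ on $D(0,n) = \{|x| < n\}$ with $u = 0$ outside by Gauss-Seidel iteration, so that $u(x)$ approximates the expected exit time, and the result can be compared with $n^2 - |x|^2$ and $(n+1)^2 - |x|^2$. The function name and the number of sweeps are illustrative choices.

```python
def expected_exit_times(n: int, sweeps: int = 2000) -> dict:
    """Approximate E^x[exit time of D(0, n)] for SRW on Z^2, x in D(0, n)."""
    sites = [(x, y) for x in range(-n, n + 1) for y in range(-n, n + 1)
             if x * x + y * y < n * n]
    u = {site: 0.0 for site in sites}
    for _ in range(sweeps):
        for (x, y) in sites:
            # One step is spent, then continue from a uniform neighbor;
            # neighbors outside the disc contribute 0 (the walk has exited).
            u[(x, y)] = 1.0 + 0.25 * (
                u.get((x + 1, y), 0.0) + u.get((x - 1, y), 0.0)
                + u.get((x, y + 1), 0.0) + u.get((x, y - 1), 0.0)
            )
    return u


if __name__ == "__main__":
    n = 5
    u = expected_exit_times(n)
    for site in [(0, 0), (2, 1), (4, 2)]:
        r2 = site[0] ** 2 + site[1] ** 2
        print(site, n * n - r2, round(u[site], 4), (n + 1) ** 2 - r2)
```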




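We also have the result of Exercise 1.6.8 of [7]: Uniformly in $r < |x| < R$,

$$P^x(T_{\partial D(0,r)} < T_{\partial D(0,R)}) = \frac{\log(R/|x|) + O(r^{-1})}{\log(R/r)}. \tag{2.4}$$

Define the hitting distribution of the boundary of $A$ by

$$H_{\partial A}(x, y) = P^x(S_{T_{\partial A}} = y).$$

We have the following Harnack inequality.

LEMMA 2.1. Uniformly for $\delta < 1/2$, $x, x' \in D(0, \delta n)$ and $y \in \partial D(0, n)$,

$$H_{\partial D(0,n)}(x, y) = (1 + O(\delta) + O(n^{-1}))\, H_{\partial D(0,n)}(x', y). \tag{2.5}$$

Furthermore, if $\delta' < \delta$ are such that

$$\min_{x \in \partial D(0, \delta n)} P^x(T_{\partial D(0,n)} < T_{\partial D(0, \delta' n)}) \ge 1/4,$$

then uniformly in $x \in \partial D(0, \delta n)$ and $y \in \partial D(0, n)$,

$$P^x(S_{T_{\partial D(0,n)}} = y;\ T_{\partial D(0,n)} < T_{\partial D(0, \delta' n)}) = (1 + O(\delta) + O(n^{-1}))\, P^x(T_{\partial D(0,n)} < T_{\partial D(0, \delta' n)})\, H_{\partial D(0,n)}(x, y). \tag{2.6}$$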
PROOF. By Lemma 1.7.3 of [7], for any y € 9D(0, n) and ó < 1/2, 


Hano, x, y) = 2; P? DOOR - z)G p.n. x). 
zc8D(0,n/2) 


But 
G p(o,1—5 (Zz — x, 0) € Goon) (2, x) € GDO, asy G — x. 9) 

and by (2.2), with |z — x] = n(1/2 + O(8)), 

2 (S + ô)n 





G p(0,(1:5)n) (z — x, 0) = — log ) + O(n!) 
1 Iz — x| 


Au^ 1 «à E 
= rea o5) * ^ 


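and (2.5) now follows.
Turning to (2.6), we have

$$P^x(S_{T_{\partial D(0,n)}} = y;\ T_{\partial D(0,n)} < T_{\partial D(0, \delta' n)}) = H_{\partial D(0,n)}(x, y) - P^x(S_{T_{\partial D(0,n)}} = y;\ T_{\partial D(0,n)} > T_{\partial D(0, \delta' n)}). \tag{2.7}$$

By the strong Markov property at $T_{\partial D(0, \delta' n)}$,

$$P^x(S_{T_{\partial D(0,n)}} = y;\ T_{\partial D(0,n)} > T_{\partial D(0, \delta' n)}) = E^x\big(H_{\partial D(0,n)}(S_{T_{\partial D(0, \delta' n)}}, y);\ T_{\partial D(0,n)} > T_{\partial D(0, \delta' n)}\big). \tag{2.8}$$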



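Since $\partial D(0, \delta n)$ separates $\partial D(0, n)$ from $\partial D(0, \delta' n)$, by the strong Markov
property and (2.5), uniformly in $w \in \partial D(0, \delta' n)$,

$$H_{\partial D(0,n)}(w, y) = E^w\big(H_{\partial D(0,n)}(S_{T_{\partial D(0, \delta n)}}, y)\big) = (1 + O(\delta) + O(n^{-1}))\, H_{\partial D(0,n)}(x, y).$$

Substituting back into (2.8) we have

$$P^x(S_{T_{\partial D(0,n)}} = y;\ T_{\partial D(0,n)} > T_{\partial D(0, \delta' n)}) = (1 + O(\delta) + O(n^{-1}))\, P^x(T_{\partial D(0,n)} > T_{\partial D(0, \delta' n)})\, H_{\partial D(0,n)}(x, y).$$

Combining this with (2.7) and the assumptions of the lemma, used to control
the error terms, we obtain (2.6) which completes the proof of the lemma. □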


Combining the above with Lemma 1.7.4 of [7] we see that if $\mu_n$ denotes uniform
measure on $\partial D(0, n)$, then for all $\delta < 1/2$ and some constants $0 < c = c(\delta) < C =
C(\delta) < \infty$ we have that uniformly for $x \in D(0, \delta n)$,

$$c\,\mu_n(\cdot) \le H_{\partial D(0,n)}(x, \cdot) \le C\,\mu_n(\cdot). \tag{2.9}$$

Let HA(z, x) = P(X, = x) be the hitting measure on A C Zz by X, with 
T4 and rA the corresponding hitting times. When dealing with X,, sets such as 
D(x, r) and 8 D(x, r) are defined with respect to the L?-distance d(., -) in ZZ. 


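LEMMA 2.2. Uniformly in $K$, $z, z' \in \partial D(0,R)$ and $x \in \partial D(0,r)$ with $4r < R < K/2$,

(2.10) $\tilde H_{\partial D(0,r)}(z,x) = \left(1 + O\left(\frac{r}{R}\log\frac{R}{r}\right)\right) \tilde H_{\partial D(0,r)}(z',x).$

Furthermore, if $4r < R < R' < K/2$ are such that

$\min_{z \in \partial D(0,R)} P^z(T_{\partial D(0,r)} < T_{\partial D(0,R')}) \ge 1/4,$

then uniformly in $z \in \partial D(0,R)$ and $x \in \partial D(0,r)$,

(2.11) $P^z(X_{T_{\partial D(0,r)}} = x;\ T_{\partial D(0,r)} < T_{\partial D(0,R')}) = \left(1 + O\left(\frac{r}{R}\log\frac{R}{r}\right)\right) P^z(T_{\partial D(0,r)} < T_{\partial D(0,R')})\, \tilde H_{\partial D(0,r)}(z,x),$

and if in addition $r^{-1} = O(r/R)$, then uniformly in $z, z' \in \partial D(0,R)$ and $x \in \partial D(0,r)$,

(2.12) $P^z(X_{T_{\partial D(0,r)}} = x;\ T_{\partial D(0,r)} < T_{\partial D(0,R')}) = \left(1 + O\left(\frac{r}{R}\log\frac{R}{r}\right)\right) P^{z'}(X_{T_{\partial D(0,r)}} = x;\ T_{\partial D(0,r)} < T_{\partial D(0,R')}).$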




PROOF. The bounds of (2.10) will follow immediately from the fact that uni- 
formly in z € 9D(0, R) and x € 3D(0, r), 


Aopen (z, x) — (1 +0 (= log =) 
(2.13) 
p (Ta (o, n^ Typo, R/2)) 
"X '<aD(0,r) P" (Typ, rn^ Tso, np) 

This is the equation above Theorem 2.1.3 of [7]. However, since that equation deals with the simple random walk in $Z^2$ and $\tilde H_{\partial D(0,r)}(z,x)$ involves paths for which the difference between $Z^2$ and $Z_K^2$ might be significant, we next explain why the same proof works for $Z_K^2$.

The proof of Lemma 2.1.1 of [7] shows that, with $A = \partial D(0,r)$, $B = \partial D(0,R/2)$ and $z \in \partial D(0,R)$,


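$\tilde H_A(z,x) = \dfrac{\sum_{v \in B} G_{D(0,r)^c}(z,v)\, H_{A \cup B}(v,x)}{\sum_{v \in B} G_{D(0,r)^c}(z,v)\, P^v(T_A < T_B)},$

with $G_{D(0,r)^c}(z,v)$ the Green's function for $D(0,r)^c$, the complement of $D(0,r)$ in $Z_K^2$. But this gives

(2.14) $\min_{v \in B} \dfrac{H_{A \cup B}(v,x)}{P^v(T_A < T_B)} \le \tilde H_A(z,x) \le \max_{v \in B} \dfrac{H_{A \cup B}(v,x)}{P^v(T_A < T_B)}.$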


Note that $B = \partial D(0,R/2)$ separates $A = \partial D(0,r)$ from the complement of $D(0,R/2)$ in $Z_K^2$. Hence, the above max and min involve expressions that are determined by paths confined between $A = \partial D(0,r)$ and $B = \partial D(0,R/2)$, which are thus the same for the simple random walks in $Z^2$ and in $Z_K^2$. Consequently, (2.14) is precisely the top inequality on page 49 of [7], from which (2.13) follows. This completes the proof of (2.10). The bounds of (2.11) follow from (2.10) in the same way that (2.6) follows from (2.5). Finally, combining (2.10), (2.11) and (2.4) leads to (2.12). □


We next show that for $R' \gg R \gg r > 1$, the $\sigma$-algebra of excursions of the path from $\partial D(0,r)$ to $\partial D(0,R)$, prior to $T_{\partial D(0,R')}$, is almost independent of the initial point $z \in \partial D(0,R)$ and the final point $w \in \partial D(0,R')$.


LEMMA 2.3. For 4r < R < R' < K/2 and a random walk path starting 
at z € D(0, R), let J€ denote the o-algebra generated by the excursions of 
the path from 9D(0, r) to 9D(0, R), prior to T3p(o, gm. Suppose r | = O(4) 
and log(R'/ R) > (1/4)log(R/r). Then, uniformly in K, z,z € 8D(O, R), w € 
80 D(0, R^) and Be Jt, 


(2.15) P^(B|X, e gy = W) = (1 + o( s ))ro» 




and 
r R / 
(2.16) P*(B) — ( + o(5 log -Jr (B). 


PROOF. Fixing z € 3 D(0, R) it suffices to consider B € J£ for which 
P*(B) > 0. Fix such B and a point w € 9D(0, R^). Let to = 0 and for i — 0, 1, ... 
define 


T2j4.1 = inf(t > Tz; : Sı € 0D(0, r) U8 D(0, RY}, 
Di42 = inf(t > 75;41:5$; € ADO, R)}. 


Abbreviating T = 75p(o, g^, note that T = t2741 for some (unique) nonnegative 
integer J. For any i > 1, we can write (B, I =i} = (Bi, to; < t) '1((1 = 0} o Or) 
for some Bj € Fry, so by the strong Markov property at Tz, 


E*[X; = w; B, I =i] = E*[E* (X; = w, I —0); Bi, vj < T] 
and 
P*(B, I =i) =E*[E*2 (I 20); Bi, vj; < t]. 
Consequently, for all i > 1, 
E [Xz = w; B, I =i] 


(2.17) 
E*(X; =w; I =0) 


> Z — 1 . 
>P*(B, I Qm E*Q — 0) 


Necessarily P*(B|J = 0) € (0, 1} and is independent of z for any B € #, implying 
that (2.17) applies for i = 0 as well. By our assumptions about r, R, R’, (2.4), (2.5) 
and (2.6) there exists c < oo such that for any z, x € 3 D(0, R) and w € 8D(0, R^), 


E” (Xz = w; I =0) > (1 — cR/R)E (I — 0)Hap(o, nn (z, w). 
Hence, summing (2.17) over 7 — 0, 1,..., we get that 
E*[Xz = w, B] > (1 — cR/ RP" (B) Hopq, R) G, w). 
A similar argument shows that | 
E*[Xz = w, B] x (1 + cR/ P')P*(B) Hap(o, n^ (z, w), 


and we thus obtain (2.15). 
By the Markov property at 71, for any z € 0 D(0, R), 


P*(B) = P*(B,I —0) 


+ »3 Aa p@,rnuaD@,R (Z, x)P* (B). 
xedD(0,r) 




The term involving (B, J = 0) is dealt with by (2.4), and (2.16) follows by (2.12) 
and our assumptions about r, R and R’ values. O 


Building upon Lemma 2.3 we quantify the independence between the o -algebra 
$* of excursions from 8 D(x, R’) to 9 D(x, R), and the o-algebra J€* (m) of ex- 
cursions from 0 D(x,r) to D(x, R) which occur during the first m excursions 
from 8 D(x, R) to 9 D(x, R^). To this end, fix 4r < R < R' < K/2andx € Z^. let 
To = 0, and fori = 1,2,... define 


ti = inf{t > Tj-1:X; € 9D(x, R)}, 
Tj = inf(t > tj: X; € 3 D(x, RY). 


Then $* is the c-algebra generated by the excursions (eU), j = 1,...}, where 
eD) = {X;:7j-1 <t < Tj} is the jth excursion from 89 D(x, R^) to 9D(x, R) (so 
for j = 1 we do begin at t = 0). We denote by #*(m) the o-algebra generated 
by all excursions from 8 D(x, r) to 9 D(x, R) from time r, until time Tm. In more 
detail, for each j = 1,2, ..., m let ¢ ; o = vj and for i =1,... define 


tj; —inf(t > £;; 4:X:; €8D(x,r)), 
Ç ji —inf(t > £j: X € D(x, R)). 


Let v;i —(Xr:£ji Sf < tjl and ZÍ = sup{i > 0:54 < Tj}. Then, J£" (m) is 
the product o -algebra generated by the o -algebras JO =o(vji,i=l,..., ZÍ) of 
the excursions between times v; and T}, for j =1,...,m. 


LEMMA 2.4. There exists C < oo such that uniformly over ./R < 4r < R < 
R' < K/2 with log(R'/ R) = (1/4)log(R/r), all m x R/(r log(R/r)), x, yo. y1 € 
Zi, and A e J€^ (m), 


(1 — Cm- log Sje (A) x P (A|g*) 
(2.18) R 
r 
< (1 + Cm log ~)p (A). 
r 


PROOF. Applying the monotone class theorem to the algebra of their finite dis- 
joint unions, it suffices to prove (2.18) for the generators of the product o-algebra 
J€* (m) of the form A = Aj. x A2 X --- X Am, with Aj € He for jJ = lat: 
Conditioned upon $* the events A; are independent. Further, each A; then has the 
conditional law of an event B; in the o-algebra #€ of Lemma 2.3, for some ran- 
dom zj = Xr; —x € 0D(0, R) and wj = Xz, — x € AD(, R^), both measurable 
on $*. By our conditions on r, R and R’, the uniform estimates (2.15) and (2.16) 




yield that for any fixed z’ € 8 D(0, R), 


m 
P? (A, x A2 X+- x Am|G*) = | [ PY (BjIX, o.r =w) 
j=! 


ii R 
(2.19) - Ta o(; jp (Bj) 
iu R : 


yY R Ju / 
Z 
=(1+0(Z10g%))} Up (Bj). 


Since m < R/(rlog(R/r)) and the right-hand side of (2.19) depends neither on 
yo€ Ze nor on the extra information in $*, we get (2.18) by averaging over $^. 
L] 


REMARK. Lemma 2.3, which deals with the path of the walk in D(0, R^, 
applies for the simple random walk 5; in Z^. Consequently, by the same argument 
as above, the bounds of (2.18) also apply for Sn. 


3. Hitting time estimates and upper bounds. For any first hitting time $T$ we set $\|T\| = \sup_y E^y(T)$. By Kac's moment formula for the strong Markov process $X_t$ (see [5], (6)), we have for any $n$ and $y$

(3.1) $E^y(T^n) \le n!\, \|T\|^n.$
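As an illustrative check of the moment bound (3.1), the following sketch computes hitting-time moments exactly (up to floating-point truncation) for a toy chain. Everything specific here is an assumption made for illustration and is not taken from the paper: the chain is simple random walk on the cycle $Z_m$ with $m = 8$, $T$ is the first hitting time of the state 0, and the function name `hitting_moments` and the truncation parameter `horizon` are hypothetical.

```python
def hitting_moments(m, y, max_power=2, horizon=4000):
    """Return [E^y(T^0), ..., E^y(T^max_power)] for T = first hitting time of 0
    by simple random walk on the cycle Z_m started at y != 0.
    The series over t is truncated at `horizon` (the tail is negligible here)."""
    # alive[s] = P^y(X_t = s, T > t): the walk killed when it reaches 0
    alive = [0.0] * m
    alive[y] = 1.0
    moments = [0.0] * (max_power + 1)
    for t in range(1, horizon + 1):
        nxt = [0.0] * m
        for s in range(1, m):
            if alive[s] > 0.0:
                nxt[(s - 1) % m] += 0.5 * alive[s]
                nxt[(s + 1) % m] += 0.5 * alive[s]
        hit_now = nxt[0]      # P^y(T = t)
        nxt[0] = 0.0          # remove the absorbed mass
        for n in range(max_power + 1):
            moments[n] += (t ** n) * hit_now
        alive = nxt
    return moments


m = 8
table = {y: hitting_moments(m, y) for y in range(1, m)}
norm_T = max(mom[1] for mom in table.values())        # ||T|| = sup_y E^y(T)
worst_second = max(mom[2] for mom in table.values())  # sup_y E^y(T^2)
print(norm_T, worst_second, 2 * norm_T ** 2)
```

In this toy case $E^y(T) = y(m-y)$, so $\|T\| = 16$, and the second moment stays below $2!\,\|T\|^2$, as (3.1) predicts.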


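Throughout this section, consider constants $r, R$ such that $0 < 2r < R < \frac{1}{2}K$. For fixed $x \in Z_K^2$, we let

(3.2) $\tau^{(0)} = \inf\{t \ge 0 : X_t \in \partial D(x,r)\},$

(3.3) $\sigma^{(1)} = \inf\{t \ge 0 : X_{t+\tau^{(0)}} \in \partial D(x,R)\},$

and define inductively for $j = 1, 2, \ldots$

(3.4) $\tau^{(j)} = \inf\{t \ge \sigma^{(j)} : X_{t+T_{j-1}} \in \partial D(x,r)\},$

(3.5) $\sigma^{(j+1)} = \inf\{t \ge 0 : X_{t+T_j} \in \partial D(x,R)\},$

where $T_j = \sum_{i=0}^{j} \tau^{(i)}$ for $j = 0, 1, 2, \ldots$. Thus $\tau^{(j)}$, $j \ge 1$, is the length of the $j$th excursion $\mathcal E_j$ from $\partial D(x,r)$ to itself via $\partial D(x,R)$, and $\sigma^{(j)}$ is the amount of time it takes to hit $\partial D(x,R)$ during the $j$th excursion $\mathcal E_j$. Hereafter, we set $\tau = \tau^{(1)}$ and use the abbreviation $\partial r = \partial D(x,r)$.
The following lemma will be used repeatedly.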
The following lemma will be used repeatedly. 




LEMMA 3.1. There exists c1 < oo such that for all 1 > n > c\(1/r +r/R) 
and R < K/6, 


2 R 
(1— n)—K*log(—) < min E*(r)< max E (t) 
T r x. yezo, x, yeZ% 


(3.6) > R 
é(14+n)—K? log(—), 
IT r 


R 
3.7 EY (T; « cy K^1 (=), 
(3.7) bcm (Tap) < cK^log : 


and for all r > c1, 


K 
(3.8) max ||T3p¢x,r) || x eiK? log(—). 
xeZz r 
PROOF. Let $X_0$ be distributed uniformly on $Z_K^2$. Then $\{X_t\}$ is a stationary and ergodic stochastic process. By Birkhoff's ergodic theorem we then have that

$\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} 1_{\{x\}}(X_i) = \frac{1}{K^2}$ a.s.
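The ergodic limit used here, that the long-run fraction of time the torus walk spends at a fixed site is $1/K^2$, can be illustrated by a small simulation. This is only a sketch under stated assumptions: the torus size `K = 5`, the number of steps, the seed, and the function name `occupation_fraction` are all hypothetical choices made for illustration.

```python
import random


def occupation_fraction(K, x, steps, seed=0):
    """Fraction of the first `steps` steps that simple random walk on the
    K x K torus spends at site x, started from a uniformly chosen site."""
    rng = random.Random(seed)
    pos = (rng.randrange(K), rng.randrange(K))
    moves = ((1, 0), (-1, 0), (0, 1), (0, -1))
    visits = 0
    for _ in range(steps):
        dx, dy = rng.choice(moves)
        pos = ((pos[0] + dx) % K, (pos[1] + dy) % K)
        if pos == x:
            visits += 1
    return visits / steps


K = 5
frac = occupation_fraction(K, (0, 0), steps=200_000)
print(frac, 1 / K ** 2)
```

The empirical fraction should be close to $1/K^2 = 0.04$; the agreement improves as the number of steps grows.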
Thus, with $. 4 — 0, 


c D marg an lp((Xi4os,,) 1 
(3.9) Jim Tne) a.s. 
j=07 

For $j \ge 1$ set $Z_j = \tau^{(j)} - E^{\mu}(\tau^{(j)} \mid \mathcal F_{T_{j-1}}) = \tau^{(j)} - E^{X_{T_{j-1}}}(\tau)$, where $\mu$ is uniform measure on $Z_K^2$. By the strong Markov property we see that $\{Z_j\}$ is an orthogonal sequence. Since any irreducible Markov chain with finite state space is positive recurrent, we have that $\|T_{\partial r}\|, \|T_{\partial R}\| < \infty$, and using (3.1) we see that the sequence $\{\tau^{(j)}\}$ and hence $\{Z_j\}$ has uniformly bounded second moments. It follows from Rajchman's strong law of large numbers (see, e.g., [2], Theorem 5.1.2) that

(3.10) $\lim_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} \{\tau^{(j)} - E^{X_{T_{j-1}}}(\tau)\} = 0$ a.s.


Similarly, set co = r® and for j > 0 let 


z0) M, 
Y; = X Ips (Xisj) m D lix} (Xi+T; ,). 
i=0 i=0 


~ X : 
Y; = Y; — E (Yj|$s, ,) = Y; - E 573 (Y1). 




By the strong Markov property (Y j} is also an orthogonal sequence, and since 
Y; < 1), the sequence (Y;) also has uniformly bounded second moments. Thus, 
by Rajchman's strong law of large numbers,


-" Xs. 
(3.11) m (208 *j33(Y]) 2-0 — as. 
j= 


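It follows from (2.2) that for some finite universal constant $c_0 \ge 1$ and all $1 \le r \le R/3 \le K/6$,

(3.12) $\frac{2}{\pi}\log\left(\frac{R}{r}\right) - c_0 r^{-1} \le \min_x \min_{y \in \partial r} E^y(Y_1) \le \max_x \max_{y \in \partial r} E^y(Y_1) \le \frac{2}{\pi}\log\left(\frac{R}{r}\right) + c_0 r^{-1}.$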
With r® finite, we get by combining (3.9), (3.10) and (3.11) that almost surely, 


n Xg, 
lim Gi) Ej BE) — eat 
BP) a 105) 


Consequently, in view of (3.12), for some finite universal constant cı and all 
1-n0zc(ü/r-tr/R), 


2 R 
=(1 -— 7) K*to8(— ) < max I? (t), 
m 3 r year 
(3.13) 
2 3; [R 
min E'(z) < — cci 3 K log| — 


ye 


For y € ðr, we have t® = 0 and by the strong Markov property at the stopping 
time o (U, 


(3.14) E? (r) =E (Tar) + Y^ Hon, 2E? (Tor). 
zcaR 


Thus, enlarging $c_0$ as needed, it follows from (2.3) and Lemma 2.1 that for all $1 \le r \le R/c_0$,

(3.15) $\max_{y \in \partial r} E^y(\tau) \le \left(1 + c_0 \frac{r}{R}\right) \min_{y \in \partial r} E^y(\tau).$

Taking also $c_1 \ge 3c_0$, we get (3.6) by combining (3.13) and (3.15).
Turning to prove (3.7), consider (3.14) for y € dr and 3R instead of R. Then, 




by (3.6) and (2.9), 
(3.16) c(1/3)E“3 (Tar) < 2K?log(3R/r). 


Using the strong Markov property, (2.3), (2.9) and (3.16), we thus have that for 
any y c QR, 


E? (T3,) x E (Taar) + E? (Tar — Tas; Tar > Ta3r) 


(3.17) « GR 4-1) + c(;)E" (Tar) 


<c&K? 'og(Ž), 
= 
for some universal cz < oo and any r, R as in the statement of (3.7). Making sure 
that c1 > c2, this completes the proof of (3.7). 
To prove (3.8) we use the bound (3.17) when the distance of y from x is between 
Ro — r/c1 and K/6, and that of (2.3) when y € D(x,r). As for y € D(x, Ro) N 
D(x,r), since 


E? (Tar) < E? (Tag) + max E*(75,), 
z€0 Ro 


we get the stated bound by combining (2.3) (for the first term above) and (3.17). 
Finally, fixing y € ZZ, V D(x, K/6), we establish the bound of (3.8) by noticing 
that the value of E’ (T5,.) for the random walk on Zi x 18 then nondecreasing in £, 
and adjusting cı accordingly (to accommodate the use of, say, Zs x) U 


The following lemma, which shows that excursion times are concentrated 
around their mean, will be used to relate excursions to hitting times. 


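LEMMA 3.2. With the above notation, we can find $\delta_0 > 0$ and $C > 0$ such that if $R < K/2$ and $\delta < \delta_0$ with $\delta \ge 6c_1(1/r + r/R)$, then for all $x, x_0 \in Z_K^2$,

(3.18) $P^{x_0}\left(\sum_{j=0}^{N} \tau^{(j)} \le (1-\delta)\,\frac{2K^2\log(R/r)}{\pi}\,N\right) \le e^{-C\delta^2 (\log(R/r)/\log(K/r)) N}$

and

(3.19) $P^{x_0}\left(\sum_{j=0}^{N} \tau^{(j)} \ge (1+\delta)\,\frac{2K^2\log(R/r)}{\pi}\,N\right) \le e^{-C\delta^2 (\log(R/r)/\log(K/r)) N}.$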




PROOF. Witht=1 = {Tar + Tar o 07,4] © O7,,, clearly 
max EY (z") < max E?" ({Tor + Tor 0 O74, }") 
y year 


lA 


n 
Y (5) mgr be tar? 6n) 


ae 


ie ) max E? (Tj) m max E*(T5, Jy, 


Let v = 2K* Jog(R/r) and u = 2K. Jog(K /r). Thus, by (3.1) and (3.7), there exists 
a universal constant c3 < oo such that for all x € Li, j 


max E? (z^) < max E" (T34)l Tog"! 
y yeðr 


n—1 
(3.20) +2c1 Y ntl Tog vi To! 17771 
j=0 
< v(esu)" (n + 1), 
where we also used (2.3) and (3.8) in the last inequality. Taking 7 = 6/6 > 0, with 
our choice of r and R, it thus follows by (3.6) that for o = cuv and all 9 > 0, 


max max E"(e*?'j«1 — 0 min min (rt) 
x ye8aD(x,r) yea D(x,r) 
0? 
Li E) 
(3.21) vod mn (^) 
€1—6(-— nv + pe? 


< exp(p6* — 0 (1 — n)v). 


Since tr > 0, using Chebyshev's inequality we bound the left-hand side of (3.18) 
by 


ad * 
p^ 2 7) < (1 ue son) < ef O30) UN xo (578 x p 


3.22) V 


N 
gain, o-n) max Be) | , 
y€8D(x,r) 


ente 


where the last inequality follows by the strong Markov property of X; at (jj. 
Combining (3.21) and (3.22) for 0 = 6v/(6p), results in (3.18) with C = 1/(36c4). 

Since t® = Tar, by (3.1) and (3.8) there exist universal constants c5, cg < oo 
such that 


(0) 
max E? (e" Ist) < c6, 
x, 




implying that 
5 tT v “1 
pro (o 2. n) = pro ea N) < egg 35) JN. 
EM csu  3csu m 


Thus, the proof of (3.19), in analogy with that of (3.18), comes down to bounding 


N N 

p^ (j) >(1+4 N\< ME —9(1+2n)v EY Or ) 
p» TY > (1+4n)v ) <e e EC (e**) 
Noting that, by (3.20) and (3.6), there exists a universal constant cg « oo such that 
for p = cguv and all 0 < 0 < 1/(2cau), 


OQ 
o" 
max max E(e)<1+6(1 v--max max — EY (c 
X yeaD(x,r) VERE UY x dit» 2 n! D, 


<1+6(1+2n)v+ p0? < exp(0 (1 -4- 2g)v + p9’). 
Taking ôo < 3cg/c3, the proof of (3.19) now follows that of (3.18). O 


We next apply Lemma 3.2 to bound the upper tail of Tg (x), the first hitting time 
of x € Zi. 


LEMMA 3.3. For any $\delta > 0$ we can find $c < \infty$ and $K_0 < \infty$ so that for all $K \ge K_0$, $y \ge 0$ and $x, x_0 \in Z_K^2$,

(3.23) $P^{x_0}\big(T_K(x) \ge y(K\log K)^2\big) \le c\,K^{-(1-\delta)\pi y/2}.$

PROOF. Fix ô e (0,89). Set R = K/7 and r = R/logK, noting that 


Lemma 3.2 then applies for all K > Ko and some Kg = Ko(6) < oo. Fixing y > 0 
and such K, let 


u zy (log K)? 
ng Mo DSUD 
Then, 

P* (Tx (x) = y(K log K)”) 
(3.24) nK ne P 
< P” Tka) = 3 0 | +P] Y tO > y(KlogK) ]. 
j=0 Ea" 

It follows from (3.19) that 


nk 
p^ p WZ > y(K log ky) « e € y(log K)*/loglog K 
jz0 




for some C’ = C'(8) > 0. Moreover, the first probability in (3.24) is bounded above 
by the probability of not hitting x during nx excursions of SRW in Z?, each start- 
ing at some point in 3 D(x, r) and ending at 0 D(x, R), so that by (2.1) 


p^ (ke) >J 2) < (1 qo 2 


par; log R 


3.25 
(3.25) < g- 023) log(K)ry/2. 


and (3.23) follows. LJ 


We next provide the required upper bounds in Proposition 1.1. Namely, for any 
a € (0,1] and y > 0, we have by Lemma 3.3, that for y/(a) > 6 > 0 small 
enough, 


2-5 SK (x) | 2a): ) 
P(|s e Zk: ces = dala} >K Y 


|x € Le i) > sajz |) 


«ing (lo ez 
m (K log K)2 ^ 





(3.26) 


ccu Jk (x) 
_ g2-9-y P(e ) 
2. (Klogky =!" 


2 
xe 


ROT MET RE UR 
K —oo 


4. Lower bounds for probabilities. Fixing $a < 2$, we prove in this section that for any $\delta > 0$ there exists $n_0(\delta) < \infty$ such that


IK, (x) 2—a-—8 
4.1 P b So A LS >x a ) 21-23, 
p (ly « Kn (Kn log Kn)? 7 an pz Ky = 





for all integers K, = n? (nl)? with n > no and y € 4 = [b, b + 4] for some univer- 
sal b > 10 (determined in Lemma 4.2). Because such K, cover all large enough 
integers, it follows from (4.1) that 


„Jim P( fx ez ome) > 2af || > uem ex 


m’ (mlogm) - 

which in view of (3.26) results with Proposition 1.1. Hereafter, any estimate in- 
volving the fixed sequence K, — n" (n!)? holds uniformly in y € 4 (even if this 
is not stated explicitly). Consequently, we may and shall prove each of our results 
only for this sequence, which already implies that they hold true for all integers 
large enough. 

We start by constructing a subset of the set appearing in (4.1), the probability of which is easier to bound below. To this end, let $r_0 = 0$ and $r_k = (k!)^3$, $k = 1, \ldots$.







For any $a > 0$ set $n_k = n_k(a) = 3ak^2\log k$ and for $x \in Z_{K_n}^2$ and $k = 3, \ldots, n$, let $\mathcal R_k^x = \mathcal R_k^x(a)$ denote the time until completion of the first $n_k(a)$ excursions from $\partial D(x, r_{k-1})$ to $\partial D(x, r_k)$. (In the notation of Section 3, if we set $R = r_k$ and $r = r_{k-1}$, then $\mathcal R_k^x = \sum_{j=0}^{n_k} \tau^{(j)}$.) For $x \in Z_{K_n}^2$, $2 \le l \le k-1$, let $N_{k,l}^x = N_{k,l}^x(a)$ denote the number of excursions from $\partial D(x, r_{l-1})$ to $\partial D(x, r_l)$ until time $\mathcal R_k^x(a)$. Let $N_{k,0}^x = N_{k,0}^x(a)$ denote the number of visits to $x$ prior to time $\mathcal R_k^x(a)$.
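To get a feel for the scales involved, the sketch below tabulates the radii and excursion counts for a few small $k$. It assumes the readings $r_0 = 0$, $r_k = (k!)^3$ and $n_k(a) = 3ak^2\log k$, which are consistent with the later remark that $\log(R/r) = 3\log n$ when $R = r_n$ and $r = r_{n-1}$; the function names `radius` and `excursion_count` are hypothetical.

```python
from math import factorial, log


def radius(k):
    """Assumed radius r_k = (k!)^3, with r_0 = 0."""
    return 0 if k == 0 else factorial(k) ** 3


def excursion_count(a, k):
    """Assumed target number of excursions n_k(a) = 3 a k^2 log k."""
    return 3 * a * k ** 2 * log(k)


for k in (3, 4, 5):
    ratio_log = log(radius(k) / radius(k - 1))   # equals 3 log k under the assumption
    print(k, radius(k), ratio_log, 3 * log(k), excursion_count(1.0, k))
```

Consecutive radii differ by the factor $k^3$, so each annulus between levels $k-1$ and $k$ contributes $\log(r_k/r_{k-1}) = 3\log k$ on the logarithmic scale.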


Fix p < (2 — a)/2. Writing m^ np if [m — ng] < k, we will say that a point 
xE Zk, is n-successful if 
(4.2) NX, — 0, x Am o Vkzpn,... n — L. 


Note that NY  — 0 is equivalent to the statement Tx, (x) > R}. Hence the next 
lemma relates the notions of n-successful and first hitting times. 


LEMMA 4.1. Let 
8n = {x € Zi : Tk, (x) > I. 
Then, for some c > 0 independent of y and all n > no, 


IK (x) | Sp si 
rae PT a — 2/1 < cn? /logn 
(U lees S 2a[n /logn E e 


PROOF. We have that for some C > 0 and no < oo, both independent of y, all 
n > ng and any x, xo € Li 


P, = P” (Tk, (x) < (a/m — 2/logn)(K, log Kn)’, Tx, (x) > RF) 


3an? logn l , 

« | Y, x x Qa/x — M logn)Kz Gn logn?) eo Creer 
j=0 

where the last inequality is an application of (3.18) with R = rn, F = r41 [so 

log(R/r) = 3logn] and ô = x /(2a logn). To complete the proof of the lemma, 

sum over x € Zt. and let c « C/2 be such that cle en >1. H 


For any x € Zk, let Y (n, x) be the indicator random variable for the event {x is 
n-successful}. In view of Lemma 4.1, we have (4.1) as soon as we show that 


(4.3) » >> ¥(n, x)= Kr) -1-—6, 
xELi 
for any 6 > 0, all n sufficiently large and y € J. 


Adopting hereafter the convention that o(1,) terms are uniform in y € 4, the 
key to the proof of (4.3) 1s the next lemma (whose proof is deferred to Section 5). 




LEMMA 4.2. Fix p « p! < (2—a)/2 and let I(x, y)  max(k: D(x, ry +1) 
D(y, re + 1) = Ø} An. There exist b > 10 independent of a and p, and qn = 
p; 0n) such that 


(4.4) P(x is n-successful) = (1+ 0(1n))Gn, 


uniformly in y € 4 and x € Sg, := Zt. V D(0, rn). Furthermore, for any € > 0 
we can find C = C(b, £) < oo such that for all n and any x, y € Sx, with p'n < 
l(x,y) «n, 


ate 

(4.5) E(Y (n, x)Y (n, y)) <q 20b cry) (==) ; 

F'l(x,y) 
while for all n and x, y € Sx, with I(x, y) =n, 
(4.6) E(Y (n, x)Y (n, y)) < (1+ 0(1,))47 
Let 
v= X E(Y@,x)¥q@,y)),  £20,L..,n. 
x, y €Sg, L(x, y)£ 
Since, by (4.4), 
e( > Y(n, »)= (1 EO o(1,)) K 2n 2 > Koc 
XESKy, 


by (4.6) and the Paley—Zygmund inequality (see [6], page 8), inequality (4.3) is a 
direct consequence of the bound 


n—1 
(4.7) Y Ve < o(1n) K472 


£=0 
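The Paley-Zygmund inequality invoked here states that for a nonnegative random variable $Z$ with finite second moment and $0 \le \theta \le 1$, $P(Z > \theta E Z) \ge (1-\theta)^2 (EZ)^2 / E(Z^2)$. The sketch below checks it exactly for a toy example; the binomial distribution, its parameters and the function names are assumptions made purely for illustration and play no role in the paper's argument.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(n, p):
    """Exact probability mass function of Binomial(n, p) as a list of Fractions."""
    return [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]


def paley_zygmund(pmf, theta):
    """Return (P(Z > theta * E Z), (1 - theta)^2 (E Z)^2 / E(Z^2)) for Z with this pmf."""
    mean = sum(k * q for k, q in enumerate(pmf))
    second = sum(k * k * q for k, q in enumerate(pmf))
    tail = sum(q for k, q in enumerate(pmf) if k > theta * mean)
    bound = (1 - theta) ** 2 * mean ** 2 / second
    return tail, bound


pmf = binomial_pmf(20, Fraction(1, 10))
tail, bound = paley_zygmund(pmf, Fraction(1, 2))
print(float(tail), float(bound))
```

In the setting of the text, $Z$ is the number of $n$-successful points, and a second moment that is at most $(1 + o(1))$ times the squared first moment forces the lower bound to be close to 1 for small $\theta$.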
Turning to prove (4.7), the definition of /(x, y) implies that 
d(x, y) < 2(ri(,y)41 + 1), 


and there are on Zt. at most Cor? 4.1 points y in the disc of radius 2(rg..1 + 1) 
centered at x, where in the sequel we let C,, denote generic finite constants that 
are independent of n. Since 2p' < 2 — a, 


pn—1 
(4.8) 2. Ve < »- ECY (n, x)) < Cid s Kiran S O(n) Kd; 
=0 x ez "dO Y) 2r ot 


Choose £ > 0 such that 2 — a — e > 0 and fix £ € [o/n, n). Then, by (4.5), we have 
that 


53. 29 p ef fn ks 
Ve < Co Korg, 1q5n C (=) . 




Consequently, 
n=l p, \ até 
> Ve < CoK;d, n? Y Ce Cri (2) 
f=p'n £—p!n re 
- n—l re 2—a—& 
(4.9) < Coq? K n? n? t6 > c-t(2 
P F 
ł=p'n H 


26; 7 K4n —2 Y Cir Le 
j=l 
Combining (4.8) and (4.9) we establish (4.7), and hence complete the proof of (4.3) 
and thus of (4.1). 


5. First and second moment estimates. For $y \in Z_{K_n}^2$ and $n \ge l \ge 3$ let $\mathcal G_l^y$ denote the $\sigma$-algebra generated by the excursions of the random walk from $\partial D(y, r_l)$ to $\partial D(y, r_{l-1})$ as defined in Lemma 2.4 (for $R' = r_l$ and $R = r_{l-1}$). We start with the following corollary of Lemma 2.4 which plays a crucial role in the proof of Lemma 4.2.


COROLLARY 5.1. Let I; = {N? nk = Mk; k —0,2,...,1 — 1}. Then, uniformly 


overalln > l > no, y Ed mis, (my :k —0,2,...,0l — 1l y € Zk, and xo, x1 € 
Ze \ Diy, 11), 


(5.1) P(N? , = mi, 97) = (12-007! 00g)5))P? TN Ns = ml? =m} 


PROOF. For j=1,2,... and k —2,...,1 — 1, let Z} denote the number of 
excursions from dD(y,rg—1) to dD(y, ry) by the random walk during the time 
interval [t;,  ;]. Similarly, let Z denote the number of visits to y during this time 
interval. Clearly, the event 


m; . 
-|$ zi =mik=0,2..1-1] 


j=l 
belongs to the o-algebra JC? (mj) corresponding to r = rj.) in Lemma 2.4. It is 
easy to verify that — at any x ¢ D(y, ri), when the event (N? p=m}eE $7 


occurs, it implies that N d" = we ZI fork 20,2,...,1 — 1. Thus, 


(5.2) P? Tg? Mop =m) = P^*(A|fj) lu? 


pmi pn y 


For some universal constant ng < oo and all / > no the conditions of Lemma 2.4 
apply for our choice of R’ = rj, R =r; and r =r,~2 with (r/ R)log(R/r) < 




417? log!. With m;/(I? logl) bounded above, by (2.18) we have, uniformly in y € 
Zk, and X0; X1 € Lie X D(y, rl), 


(5.3) P*(A|G)) = (1 + O^! dogl)*))P*! (A). 
Hence, 
POTR U2 my = (1+ 007 Gogl)”))P™ Cd =m) 
Taking xo = x; and averaging, one has 
P^ (PN? , =mi) = (1+ O(L7 ogi)?))P*!(A) 
= (1+ OU~! logJ)^?)) P" (Alg), 


where the second equality is due to (5.3). Using that (N7 ; =m} C 9j, (5.2) 
and (5.4) imply (5.1). LJ 


(5.4) 


PROOF OF LEMMA 4.2. We start by proving the first moment estimate (4.4). 
To this end, let m = (mpg, Mpn+1,---,Mn) be a candidate value of N7,, k = 
pn, ...,h, and set |m] = 25» j= pn I j — 1. Let Jt, (m) be the collection of maps 
("histories"), 

$:(1,,2,...,|[m|]) e (on — 1, on, ...,n] 


such that s(1) =n — 1,s(]ml) =n, |sGj + 1) — s(j)| = 1 and the number of up- 
crossings from £ — 1 to £ 


u(£) = {Gj + Dis), sG +D) =C- 1, 0} = me. 


The number of ways to partition the u (£) up-crossings from £ — 1 to £ before 
and among the u(£ + 1) up-crossings from £ to £ + 1 is 


prt i ia — D 


Since the mapping s is in one-to-one correspondence with the relative order of all 
its up-crossings, 


i663) = H (mer + me — " 


nm 
£—pn £ 
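The counting step for histories can be checked by brute force in small cases. The sketch below assumes that the number of ways to place the $m_\ell$ up-crossings from $\ell-1$ to $\ell$ before and among the $m_{\ell+1}$ up-crossings from $\ell$ to $\ell+1$ is the stars-and-bars count $\binom{m_{\ell+1}+m_\ell-1}{m_\ell}$, that is, the number of ways to distribute $m_\ell$ indistinguishable items over $m_{\ell+1}$ ordered gaps. The function name `count_placements` is hypothetical.

```python
from itertools import product
from math import comb


def count_placements(m_low, m_high):
    """Brute-force count of the ways to split m_low up-crossings into m_high
    ordered gaps (one gap before each of the m_high higher up-crossings)."""
    return sum(
        1
        for gaps in product(range(m_low + 1), repeat=m_high)
        if sum(gaps) == m_low
    )


for m_low, m_high in [(2, 3), (3, 4), (4, 2)]:
    print(m_low, m_high, count_placements(m_low, m_high),
          comb(m_high + m_low - 1, m_low))
```

Multiplying these counts over the levels $\ell$ gives the size of the set of histories under the same assumption, since a history is determined by the relative order of its up-crossings.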


To each path w of the random walk X. we assign a “history” h(@) as follows. Let 
t (1) be the time of the first visit to 9 D(x, r,..1), and define c (2), 1 (3), ... to be the 
successive hitting times of different elements of (9 D(x, rpg 1), ...,9D(x, ra)). 
If y e 9D(x, rg) for some k, let (y) = k and set h(w)(j) = P(w(t(J))). See 
Figure 1. 

Let A), be the first k coordinates of the sequence h. Let pe = log(re+1/re)/ 
log(re+1/re—i) and qe = log(re/re—1)/logre. Note that log(d(y, x)/r) = 1+ 







FIG. 1. A path with “history” h(w) = (4,3,2,3,2, 1, 2, 3, 4, 5). 


O(r~') for any r, uniformly in x and y € 8D(x,r). So, applying the Markov 
property successively at the times c (1), 1(2), ..., v ([m| — 1) and relying on (2.4) 
except for up-crossings from on — 1 to en, for which (2.1) applies, or for down- 
crossings from n to n — 1, which occur with probability 1, we get that uniformly 
for any s € J£, (m) and x € Sx,, 


Phy, =s, Tr, (x) > c(]m])] 


n—i 


= |] (ret Om = pe + OGM 
£—pn 


x (1 — gpn + O((nlogn) ?))"o. 




Taking mn = nn, we see that uniformly in x € Sx, and y € J, 
P(x isn-successful)= X` P{hy,, € Han), 7k, (X) > c (1m) 


Img—ne|x£ 
(5.5) f 
= (1 +0(ln))ĝn, 
which is (4.4) for 
| a nadie 
69 a= E a-aw"e [T ("* 2*7 1 ) pra - pore. 
Hi pn, .-yHn—i f=pn t 


Im—nz| <£ 
Since p; = 1/2 — O((£1og £)-!), by the proof of [3], Lemma 7.2, we have that 
; . £ £4-1 : 
uniformly in mg ~ ng, Meyi ^ Rg41 


?/g—3a—]l - 
Ct Z a + me — ‘} pm — po)" 


Jlog£ ~ m, 
(5.7) j 
C£732-1 
Pre CTRPA 
^. J/log£ 


with 0 « C', C « oo independent of £. Further, with qe = £71 + O(1/£1og£) we 
have that uniformly in m pn An pn 

(5.8) A= gon)” es poU, 

Putting (5.6)-(5.8) together we see that gn — Tn aon) with the o(1,,) term inde- 
pendent of y, as claimed. 


Setting Mj :— (I, L4- 1, ..., n — 1) note that the same analysis gives also for any 
l > pn, uniformly in x € Sx, y and m, < kl, 


P(N; , = my,k € Mi) 


n—1 
=(1+o(in)) | Í pM b H ') Py ( — py)", 
k=l 


(5.9) 


Recall that nz (a) = 3ak?log k and that we write N ^ ng if |N — ng] < k for 
pn € k €n — 1 and N =0 when k = 0. Relying upon the first moment estimates 
and Corollary 5.1, we next prove the second moment estimates (4.5) and (4.6). To 
this end, fix x, y € Sx, with 2rj4; + 2 > d(x, y) > 2r; +2 for some p’n <1 < 
n — 1. Since 7/42 — rj >> 27141, itis easy to see that D(y, ri) N 3 D(x, rp) = Ø for 
all k zz 1 + 1. Replacing hereafter / by I ^ (n — 3), it follows that for k 41+ 1, 


k X: E 2, the events {N} , x ny) are measurable on the o-algebra $7. With Jj := 
(0, on, ...,1 — 1} and J; := (0, on,...,1,14+3,...,2—1}, we note that 


(x, y are n-successful} C (N7 , E) njk,kelyn (N; x ^ nk, k € Jit}. 




Applying (5.1), we have that for some universal constant C3 < oo, 
P(x and y are n-successful) 
€) EP), Ang ke JN > =m, $1); NX pwn k € f] 
(5.10) mini 
< C3P(N Am, ke Ii 2, P(N; S np k € JIN 2 = m1). 
m~n 
Using Corollary 5.1 once more, we have that 
(1 T O(1n))Gn 
= P(y is — 
= Y EPN? , A ng, k € JIN? , =m), 9); 


mini 
y e y & 
(5.11) N? =m, N? „~ np k € Mii] 
> C4 > P(N ? =mi, NS ng, k € Mia) 
m~n 
x P(N ? Ang ke JN » =m), 


for some universal constant C4 > 0. Hence by (5.9) and (5.7), for some universal 
constant C5 « oo, 


(5.12) 2 P(N? , ~ ng k € JN. tomo < e(Te oak Jae 


k=l 


mi An 


Similarly, using Corollary 5.1, 
P(N? , & ny, k € Ij) 
k k 
(5.13) < X EPPS, ~ n k € JN. = mi, $1); Ns o nk k € Misa] 
l T. 
mj^-ni 


< CgP(N ž o X np, k € M143) 2 P(N; ripe Ji N nf = M1). 
mony : 


Comparing (5.13) and (5.11), and applying once more (5.9) and (5.7), we get that 


1+2 
(5.14) P(N: k Š nk, kel)sz e(l P fiogk as 


k=l 




Putting (5.10), (5.12) and (5.14) together proves (4.5). 
In case d(x, y) > 2(ry + 1), the event (x is n-successful} is $7 measurable, 
hence 


P(x and y are n-successful) 


= E({P(y is n-successful|$2)]; x is n-successful) 
= E({P(N? , ~ ng, k € InIN? „ = nn, $2)); x is n-successful), 
and (4.6) follows from Corollary 5.1. O 


6. Large deviation bounds. This section provides crucial large deviations es- 
timates that are key to the proofs of Theorem 1.2 and of the upper bounds in Theo- 
rems 1.3 and 1.4. Roughly, we will be providing precise decay rates for the events 
that certain normalized excursion counts of balls concentric to a point z (excur- 
sions between levels rgy,—j and rgn, before making nn excursions between levels 
Fn—1 and rp) are atypical, together with forcing one or two points nearby not to be 
visited during these excursions. 

More precisely, fix 0 < B < 1 and n > n. Recall the definition Fjg(y) = 
(1 — yBY/( — B) + hy?B of (1.10). For any h > 0, the unique global min- 
imum of Fy, g(yv) is at yh = yn (P) = 1/(h(1 — B) + B). For 0 < a < 2, with 
Ni y = Na (0) and A = Rp (a) as in Section 4, we establish large deviations 
bounds away from yp for the random variables Nx Bn (a) i= nj £n (a)/ngn(a) to- 
gether with the events (7x. (x) > RZ(a)} and (7x. (x^) > A£(a)) for A > n and 
x, x’ not too far from z € Z^ -> that is, (z, x) and (z, x, x’) belonging to the sets 


(6.1) Go(fi) = [(z, x) :z € Zg, x € D(z, rgn-2) N Zi. }» 
Go" (ñ) = {(z, x, x’) : (z, x) € Golf), 


(z, x^) € Go(i), x’ € D(x, rgnn/2-3)}, 


where h € (0, 2). To express the bounds, define 


(6.2) 


[.y^ | Y < Yh, 
In (y) = { [0, o0), Y = Yh, 
[/4,00) y yn. 


LEMMA 6.1. FixingO0 <h «2anda, y,6 >0,foralln > n > no we have the 
bounds 


(6.3) max P(R 5, (a) e Io(y)) S Ka ^ ^^, 
zéZ, 


Ky 




PSK; Ri , Ne el 
(1,3) Go() (Tki) > Ral), Na 5, (a) € L(y) 


(6.4) 
< k, Pes 
max — P(Tx;(x’) > Rila), Tk, (x) > Kt (a), Nz o, (a) € In(y)) 
(z,x,x")€Go" (ñ) 


(6.5) n ; 
< K,’ h, BOT 


As is often the case with large deviation statements, the key to the proof of 
Lemma 6.1 lies in the evaluation of certain moment generating functions. To state 
these, fix z € Z7. , and abbreviate 0; for 0 D(z, ry). Consider a path of the simple 
random walk starting at a fixed y € 0,1. Let Z denote the number of excursions 
of the path from dgn—1 to dg, until Ta, and A(x) = (73, < Tx}. See Figure 2. 





F1G.2. Z4. 


Let A; — 1/(1 — B) - h/B for0 <h « 2. 


LEMMA 6.2. Uniformly in (z, x, x’) € Go" (fi), A > n and y € 9D(z, rai), 


À c(0, A) 
1-d- $i) nlogn i 





(6.6) E" (e Z/^) <1 4 -( 


for some c(0, à) < œ and all X < A$, 


pr—1 ) c(1, A) 
B—-0—8)08—1/ nlogn’ 


for some c(1, X) < oo and all à « A7, and 


1 
(6.7) E? (e^ 7/"1,4) «l4 -( 


(e Z/^4 | dan) <1 "(zu iz) ia Po 
68 Bie alae) ST* , Us 70. 5)05—4)/ * nlogn 
for some c(h, X) < oo and all X < A7. 


REMARK. The bound (6.8) is an improvement over (6.7), and will be used only in the region $h > 1$ (in fact, $h$ near 2).


PROOF OF LEMMA 6.2. Recall that by (2.4), for some c1 < co, all n > no and 
any z: 


(6.9) q- = min P" (Ta, = Tog -1) < max P" (Ta, < Tapas) = d+: 


ve Bn U€Ofn 


(6.10) q- S an P" (Tiji < To, ) < ens. P" CEPE < T5, ) S44; 


where q+ = (1 — B)~'n7!(1 + c1/logn). By (6.10), for any y € dn—1, 

(6.11) P*(Z =0) = P? (Tani > Ts, ) «1-—4., 

and for j = 1,2,... we have Z = j if we first visit 0g, 1 prior to ða, then have 
exactly j — 1 cycles consisting of visits to 0g, and back to 93g,..1, prior to the 
first visit to 94. Hence, by (6.9), (6.10) and the strong Markov property, for any 
y € 064.1 we have that P? (Z = j) x (1 — q_)/~1!q%.. The bound (6.6) then follows 
from the h = 0 case of the inequality 


OO 
(1—g_)+ 3 e) pi A — a "G4. 


z 
(6.12) 
«il Br—h ) c(h, X) 
T  nNB—(0—8)0B—h)/ nlogn’ 
where in general 


h 


Re) and A«2;. 


Ph = 




To see (6.12) let v = 1/(1 — B) + h/B — A. Then, for some finite C and no 
(both depending on c', c1, 4, h and £), we have that qi (1 — pr)" < (n(1 — 
8)) ^ + C/logn) and 1 — e*/"(1 — pj)(1 — g_) > n™tu(1 — C/logn), for all 
n > no. Consequently, for some c = c(h, A) « oo and all n > no, 


oO 
0 —4-) - Yo eiA — pa —q 7192 
j=l 


21 . A/n 
q4(l— paye 
=(1-—g_ 
(1—4 ) +7 


— eN — py)(1— q—) 
dq. -( 1 1 ) m C 
T  nN(ü-BYyv 1—8) nlogn 


] Bà —h C 
Vv aA 
nNB-Fh(1—B)—B(0 —B)47  nlogn 
which gives (6.12). 
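The computation behind (6.12) rests on summing a geometric series: for a ratio $\rho = e^{\lambda/n}(1-p)(1-q_-) < 1$, one has $\sum_{j \ge 1} e^{j\lambda/n}(1-p)^j(1-q_-)^{j-1}q_+ = q_+(1-p)e^{\lambda/n}/(1-\rho)$. The sketch below compares a long partial sum with this closed form. The numerical values of `lam`, `n`, `p`, `q_minus` and `q_plus` are arbitrary illustrative assumptions, not the quantities $p_h$, $q_\pm$ of the paper, and the function names are hypothetical.

```python
from math import exp


def mgf_partial_sum(lam, n, p, q_minus, q_plus, terms):
    """(1 - q_minus) + sum_{j=1}^{terms} e^{j lam/n} (1-p)^j (1-q_minus)^(j-1) q_plus."""
    total = 1.0 - q_minus
    for j in range(1, terms + 1):
        total += exp(j * lam / n) * (1 - p) ** j * (1 - q_minus) ** (j - 1) * q_plus
    return total


def mgf_closed_form(lam, n, p, q_minus, q_plus):
    """Closed form of the full series; requires the geometric ratio to be below 1."""
    ratio = exp(lam / n) * (1 - p) * (1 - q_minus)
    if ratio >= 1:
        raise ValueError("series diverges: geometric ratio is not below 1")
    return (1.0 - q_minus) + q_plus * (1 - p) * exp(lam / n) / (1.0 - ratio)


params = dict(lam=0.5, n=100, p=0.02, q_minus=0.018, q_plus=0.022)
partial = mgf_partial_sum(terms=5000, **params)
closed = mgf_closed_form(**params)
print(partial, closed)
```

The restriction on $\lambda$ in the lemma plays the role of the condition `ratio < 1` here: once the ratio reaches 1 the series, and hence the moment generating function bound, is no longer finite.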
We next turn to (6.7). Enlarging c; as needed, by (2.1) we have that for all 
ñ >n > ng and (z, x) € Go(n), 


min P"(T, < Top) > 
vEdgn-~1 


mi P"(T, <T, 
OON (Tx < Ta D(x,0.5rg,)) 


- i (1 uh x) —p 

— Bn logn/ * 
We have Z14(x) = j = 1 if we first visit 0g, 1 prior to ðn, then have j — 1 cycles 
consisting of visits to 0g, and back to 0g,—1 without hitting x or 9,, and finally, 
a visit to 0, without hitting x. Hence, by (6.9), (6.10), (6.13) and the strong Markov 
property, for any z, y and x as above, 


(6.13) 


(6.14) P'(Z—j, A(x)) x (1 — p 1 — 42) ai. 
Note that A(x) occurs when Z = 0, so that (6.11), (6.14) and the h = 1 case of 
(6.12) give (6.7). 


We finally turn to (6.8). By the strong Markov property at min(7;, Ty), for 
v € 0g, —; and x, x’ e D(z, rpn—2); 


P" (max(T;, Ty’) < Togn) 
(6.15) <P'(Ty < Top, )P* (Tx < Togn) 
+ P'(T, « Tag, )P" (Tx < D): 


Enlarging c; as needed, since log rgnn/2—3/ log rg, = h/2 + O(1/logn), similarly 
to the derivation of (6.13) we have by (2.1) that for all n > no 


and (z, x, x^) e Go" (ñ), 
P* (Ty < 1524) max P" (T, < Tapn) 
vEðgn—1 


ax — P"(Ty < Tap(x2rg,)) 


<P (TT. « T. oe m 
T ( x aD(x 3 rga)) d(v,x)>0.5rgn—1 


~( h Ci ) 
[Ies 
pn 2 logn 


The same bound applies to the other term on the right-hand side of (6.15). When 
combined with (6.13) which applies for both x and x', these bounds yield (by 
inclusion-exclusion) that for all n > no, uniformly in (z, x, x") e Go (A), 


max p” (Tx > Tagn» Ty > Tan) 
VEIBn—} 





(6.16) 
<]-2p+ : (1 ar 2 
7 i Bn 2 logn 





)=:1- 


with pj = A —c'/ logn) of the same form as ph. Note that ZlaA()lAq» zy] 
if the walk visits $\partial_{\beta n-1}$ prior to $\partial_n$, then has $j - 1$ cycles consisting of visits to
$\partial_{\beta n}$ and back to $\partial_{\beta n-1}$, without hitting $x$, $x'$ or $\partial_n$, and finally, visits $\partial_n$ without
hitting $x$ or $x'$. Hence, by (6.9), (6.10), (6.16) and the strong Markov property,
for any $z$, $y$, $x$ and $x'$ as above,


P (Z = j, A(x), Ax) x Q— Pa (1 — 42) la, 


and (6.8) now follows as in the derivation of (6.7). This completes the proof of
Lemma 6.2. □


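PROOF OF LEMMA 6.1. A straightforward calculation shows that for any
$h \ge 0$ and $\gamma > 0$,

(6.17) $F_{h,\beta}(\gamma) = \lambda_{h,\gamma}\gamma^2\beta^2 - \dfrac{\beta\lambda_{h,\gamma} - h}{\beta - (1-\beta)(\lambda_{h,\gamma}\beta - h)}$, where $\lambda_{h,\gamma} := \dfrac{\beta + h(1-\beta) - 1/\gamma}{\beta(1-\beta)} < \lambda_h^*$,

and $\lambda_{h,\gamma} < 0$ if and only if $\gamma < \gamma_h$.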
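The "straightforward calculation" behind (6.17) can be checked mechanically. The following is a minimal sketch in exact rational arithmetic, assuming $F_{h,\beta}(\gamma) = (1-\gamma\beta)^2/(1-\beta) + h\gamma^2\beta$ (the form recalled at the start of Section 9) and assuming $\gamma_h = 1/(\beta + h(1-\beta))$, which agrees with the values $\gamma_0 = 1/\beta$, $\gamma_1 = 1$ and $\gamma_2 = 1/(2-\beta)$ used in the text. The function names and the sample grid are illustrative only.

```python
from fractions import Fraction as Fr


def F(h, beta, gamma):
    """F_{h,beta}(gamma) = (1 - gamma*beta)^2 / (1 - beta) + h * gamma^2 * beta."""
    return (1 - gamma * beta) ** 2 / (1 - beta) + h * gamma ** 2 * beta


def lam(h, beta, gamma):
    """lambda_{h,gamma} = (beta + h(1 - beta) - 1/gamma) / (beta (1 - beta))."""
    return (beta + h * (1 - beta) - 1 / gamma) / (beta * (1 - beta))


def rhs_617(h, beta, gamma):
    """Right-hand side of (6.17), written in terms of lambda_{h,gamma}."""
    l = lam(h, beta, gamma)
    return l * gamma ** 2 * beta ** 2 - (beta * l - h) / (beta - (1 - beta) * (l * beta - h))


def gamma_h(h, beta):
    """Assumed threshold gamma_h = 1 / (beta + h(1 - beta))."""
    return 1 / (beta + h * (1 - beta))


def check_617(samples):
    """True if (6.17) and the sign criterion for lambda hold at every sample point."""
    return all(
        F(h, b, g) == rhs_617(h, b, g)
        and ((lam(h, b, g) < 0) == (g < gamma_h(h, b)))
        for h, b, g in samples
    )


# Illustrative grid: h in {0, 1, 2}, beta in {0.1, ..., 0.9}, gamma in {1/7, ..., 29/7}.
SAMPLES = [
    (Fr(h), Fr(b, 10), Fr(g, 7))
    for h in (0, 1, 2)
    for b in range(1, 10)
    for g in range(1, 30)
]
```

The check works because $\lambda_{h,\gamma}\beta - h = (\beta - 1/\gamma)/(1-\beta)$, so the denominator in (6.17) collapses to $1/\gamma$ and the fraction to $(\gamma\beta - 1)/(1-\beta)$.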

Let $Z_0$ denote the number of excursions from $\partial_{\beta n-1}$ to $\partial_{\beta n}$ before the walk first hits
$\partial_{n-1}$ and let $A_0(x)$ denote the event that $x$ is not visited during this time interval.
For any $j \ge 1$ let $Z_j$ denote the number of excursions from $\partial_{\beta n-1}$ to $\partial_{\beta n}$ during
the $j$th excursion of the walk from $\partial_{n-1}$ to $\partial_n$ and let $A_j(x)$ denote the event that $x$ is
not visited during this excursion. With this notation,

$$N^z_{n,\beta n}(a) = \sum_{j=0}^{3an^2\log n} Z_j,$$




and the event $\{\mathcal{T}_{K_n}(x) > R^z_n(a)\}$ is the intersection of the events $A_j(x)$ for
$j = 0, \ldots, 3an^2\log n$. Consequently, using Chebyshev's inequality and the strong
Markov property (at the start of the $3an^2\log n$ excursions from $\partial_{n-1}$ to $\partial_n$), for any
$\delta > 0 > \lambda$ and all $\tilde n \ge n \ge n_0$, uniformly in $z$,


P(R: gn (a) < y?) < e^" eIn eè E] 21) 


(6.18) )" logn 


< ala | max EY (e*4/") 
y€0,—, 
For $\gamma < \gamma_0$ consider (6.18) for $\lambda = \lambda_{0,\gamma} < 0$, applying (6.17) and (6.6) to obtain
(6.3) in case $\gamma < \gamma_0$. Turning to deal with $\gamma \ge \gamma_0$, note that $P^y(Z = j) \le (1-q_-)^{j-1}q_+$
for all $j \ge 1$, even if $y \in \partial_{\beta n-1}$. Thus, for any $\lambda < \lambda_0^*$, similar to
the derivation of (6.6) we get that for some $c_5 = c_5(\lambda) < \infty$ and all $\tilde n \ge n \ge n_0$,
uniformly in $z$,


(6.19) E(e*Z0/") < max E?(e^Z/") < cs. 
yEOgn—1 


Analogous to (6.18) we also have that for any ô > 0,4 > 0, A > n > no and z, 
P(E 5, (a) = y?) x eV mI (o Eja En) 


(6.20) An 
< aR P ( max E? (e^7/ ")) | 
YEOn—1 
Considering (6.20) for $\lambda = \lambda_{0,\gamma} \ge 0$ (as $\gamma \ge \gamma_0$), and applying (6.6) and (6.17), we
complete the proof of (6.3).
Similarly, we have that for any $\delta > 0 > \lambda$, $\tilde n \ge n \ge n_0$, $(z, x) \in G_0(\tilde n)$,


P(7x, (x) > Rž (a), NZ 5, (a) x y^) 


s i 
(6.21) grt "(Ti CES ^ m 

jx] 

223 : 3an? logn 
«Doy " max E (9 1,46)) 

y€O,-i 
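Given $\gamma < \gamma_1$, consider (6.21) for $\lambda = \lambda_{1,\gamma} < 0$, and apply (6.17) and (6.7) to
get (6.4) for $\gamma < \gamma_1$. Further, the same argument leading to (6.19) shows also that
$\max_{y \in \partial_{\beta n-1}} E^y(e^{\lambda Z/n} 1_{A(x)}) \le c_5$ for all $\lambda < \lambda_1^*$. Consequently, for $\delta > 0$, $\lambda \ge 0$,
$\tilde n \ge n \ge n_0$ and $(z, x) \in G_0(\tilde n)$,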


P(7x; (x) > RZ (a), NZ 5, (a) = y?) 


An 
n— 




and since $\lambda_{1,\gamma} \ge 0$ for $\gamma \ge \gamma_1$, we complete the proof of (6.4) by using again (6.17)
and (6.7).

Using (6.8) and $\lambda = \lambda_{h,\gamma}$, the proof of (6.5) proceeds along the same lines, thus
completing the proof of Lemma 6.1. □


7. Late points in a small neighborhood. We devote this section to the proof 
of Theorem 1.2, as the basic large deviations bounds needed are already in place. 


PROOF OF THEOREM 1.2. We actually show that for 0 < œ < f^ < 1, some 

b < oo, any $, ô, n > 0, and all n > no, y € £ and x =x, € Zi, 

(L.1  P(|£x, (0) N D(x, rpn—»)| = K2P- Q9 9/8143) < 2g, 

(7.2) P(|£x, (à) Y D(x, rgn+b)| = KETONE) > 1 — 2y. 

Since log rgn+p/log Kn — B and the set of K, values cover all large integers, the 
theorem follows by considering 7 |, 0 and adjusting the values of B, ô > 0 and 
E>0. 

Starting with the upper bound (7.1), recall the notation $R^x_k(a)$ for the time
until completion of the first $n_k(a) = 3ak^2\log k$ excursions from $\partial D(x, r_{k-1})$ to
$\partial D(x, r_k)$, $k = 3, \ldots, n$, then $N^x_{k,0}(a)$ for the number of visits to $x$ until time
$R^x_k(a)$, and $N^x_{k,l}(a)$, $2 \le l \le k-1$, for the number of excursions from $\partial D(x, r_{l-1})$
to $\partial D(x, r_l)$ until time $R^x_k(a)$. Let $t_n^* = \frac{4}{\pi}(K_n\log K_n)^2$ and


(7.3) £x, (à) := b € Zz :Tx,(y) > max za), 
ZE Kn 


taking hereafter $\xi \in (0, 2\alpha)$ and $\tilde a = 2\alpha - \xi > 0$ (in the remainder of the paper
we always have $\tilde a < 2\alpha < a$). Applying (3.19) with $R = r_n$, $r = r_{n-1}$ and $N =
3\tilde a n^2\log n$, we see that for some $c = c(\alpha, \xi) > 0$ and all $n$,


Z(% V* —1_—cn* logn 
(7.4) mex P(R2 (4) > at*)<c e en 
Kn 


resulting with 
(7.5) im P(L£x,(@) € £x, @))=1. 
Hence, to establish (7.1) it suffices to show that 
(7.6) P(|-Éx, (à) N D(x, rgn-2)| = Kp? 9/8 99) < n. 
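Since $F_{0,\beta}(\gamma) > 0$ for $\gamma < \gamma_0 = 1/\beta$, it follows from (6.3) that for any $\delta' > 0$,

(7.7) $\lim_{n\to\infty} \max_{x \in \mathbb{Z}^2_{K_n}} P\big(N^x_{n,\beta n}(\tilde a) \le (1-\delta')\, n_n(\tilde a)\big) = 0.$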



Recall that $F_{1,\beta}(1/\beta) = 1/\beta$ and $r_{\beta n} \le K_n^{\beta}$ for all $n$. Moreover, $(1-\delta')n_n \ge
\gamma^2 n_{\beta n}$ for $\gamma = (1-\delta')^{1/2}/\beta$ and all $n$. Hence, if $\gamma > \gamma_1$, then by (6.4) we have that
P(|-£x,(@) N D(x, rgn-2)| = K20 9/8199. NX X. (3) > (1 — 0^)ns (4) 
—(QB—àa[p)—48 2 
~~ =m Be yeaa ina) C R60) > S8, Mg G0 =v") 


< KËTA- F,g(y»- m 


With $\beta < 1$, for $\delta' > 0$ small enough we have both $\gamma > \gamma_1 = 1$ and $F_{1,\beta}(1/\beta) -
F_{1,\beta}(\gamma) < \delta$. Thus, considering (7.7) and (7.8) for such $\delta'$ completes the proof
of (7.6), hence also of (7.1).

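Turning to prove the lower bound (7.2), fixing $0 < \xi < 2(\beta^2 - \alpha)$ so $a' = (2\alpha +
\xi)/\beta^2 < 2$ and $0 < \rho < (2-a')/2$ we say that a point $y \in \mathbb{Z}^2_{K_n}$ is $\beta n$-successful if

$$N^y_{\beta n,0}(a') = 0, \qquad N^y_{\beta n,k}(a') \overset{k}{\sim} n_k(a') \quad \forall k = \rho\beta n, \ldots, \beta n - 1.$$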


In particular, if y is Bn-successful, then Tx, (y) > Ron (1^). Let fae , (4^ Bn) be 


the set of points in Zt which are Bn-successful. A rerun of the proof of (4.3), 
this time with Bn replacing n, shows that for some b > 10, any ô > 0, 7 > 0, all 
n > no, y e £ and x € Zk,» 


(7.9) P(| Lig (a^, Bn) N D(x, rgns9)| > KLEO) > 1— . 
Consequently, (7.2) follows once we show that uniformly in x, 
7.10 P in R} (a) < it) 0. 
( ED Pp) pula ) Sat, ) > 
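To see this, let $\mathcal{Y}_n$ be a minimal set of points in $D(x, r_{\beta n+b})$ such that

$$D(x, r_{\beta n+b}) \subseteq \bigcup_{y \in \mathcal{Y}_n} D(y, r_{\beta n-2}).$$

Let $\tilde R^y_{\beta n}(a')$ denote the time until completion of the first $n_{\beta n}(a')$ excursions from
$\partial D(y, r_{\beta n-1} + r_{\beta n-2})$ to $\partial D(y, r_{\beta n} - r_{\beta n-2})$. For any $z \in D(y, r_{\beta n-2})$ we have
that

$$D(z, r_{\beta n-1}) \subseteq D(y, r_{\beta n-1} + r_{\beta n-2}) \subseteq D(y, r_{\beta n} - r_{\beta n-2}) \subseteq D(z, r_{\beta n}),$$

implying that each excursion from $\partial D(z, r_{\beta n-1})$ to $\partial D(z, r_{\beta n})$ requires at least
one excursion from $\partial D(y, r_{\beta n-1} + r_{\beta n-2})$ to $\partial D(y, r_{\beta n} - r_{\beta n-2})$. See Figure 3.
Thus, $R^z_{\beta n}(a') \ge \tilde R^y_{\beta n}(a')$ and consequently,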

(7.11) 


P( imi án (4) € att; ) < P($5, (a) x ath). 







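FIG. 3. $A = \partial D(z, r_{\beta n-1})$; $a = \partial D(y, r_{\beta n-1} + r_{\beta n-2})$, $b = \partial D(y, r_{\beta n} - r_{\beta n-2})$, $B = \partial D(z, r_{\beta n})$.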


Applying (3.18) with $R = r_{\beta n} - r_{\beta n-2}$, $r = r_{\beta n-1} + r_{\beta n-2}$ and $N = n_{\beta n}(a') =
3(2\alpha + \xi)n^2\log(\beta n)$, the right-hand side of (7.11) is bounded by


R 
C exp| -c( 1) logn} «cl exp{—cn logn}, 


for some C, c > 0 that depend only on a, E > 0, yielding (7.10) (recall that |V,,] < 
Cn". O 


8. Clusters of late points. Fixing 0 < a, < 1, this section is devoted to 
the proof of Theorem 1.3. As usual, it suffices to establish (1.5) and (1.6) for the 
subsequence K, = n (n!)?, provided all our estimates are uniform in y € 4. To 




this end, set 


(8.1) $W^x(\beta_0, \beta_1) = |\{y \in \mathcal{L}_{K_n}(\alpha) : r_{\beta_0 n-3} < d(x, y) < r_{\beta_1 n-3}\}|,$
with $W^x = W^x(0, \beta)$. We actually prove that:


LEMMA 8.1. For each ô > 0 there exists € € (0, 8/2) such that 
(82)  p,:— Kete max P(x € £x, (a), W* < K;P0 7907?) — 0. 


xeZy H--> OO 


LEMMA 8.2. For each 6 > 0 there exists € € (0, 6/2) such that 
(8.3)  p,:— Kor max P(x € Lg, (œ), W* > K POTR —> 0. 
x 


n> O© 
ez 


By (1.2), we have P(|£x,(a)| > Kz 9 */^) —> 1 for n — oo, and with 
log rgn—3/log Kn — B, the bounds (8.2), (8.3) imply that (1.6) holds (adjusting 
P as needed). These bounds also imply that (1.5) is a consequence of the uniform 
lower bound P(x € £x,(a@)) > Kn aod 2: holding for any n large enough and all 
x € Z% , x #0. Applying Lemma 4.1 we get the latter bound as soon as 


(8.4) min P(Jx,(x) > R? (a)) > Ke? 
xez M0) 


holds for a = 2a + ¢/7 and all n sufficiently large. Since 7x, (x) > R* (a) when- 
ever x is n-successful, by (4.4) and translation invariance of the SRW we have 
that 


8.5 min min P?(Tp (x) > &*(a)) > K722 8/6, 
( ) xez, y£D(x,ra) ( Kn ( ) nt ))> n 


For any finite r > 0 there exists c = c(r) > 0 such that P(T; > Tsp(x,)) = c for 
all n sufficiently large and all x Æ 0. Consequently, by (2.1) we have that P(T} > 
T3p(x,r,)) = c'/logrn > Kn / for some c' > 0, all n sufficiently large and all x + 
0. Combining this-with (8.5) and the strong Markov property at Tj p,x,r,) results 
with (8.4), thus completing the proof of Theorem 1.3. 


PROOF OF LEMMA 8.1. Let Z* g := {z € Z5, 9:2 0, d(x, z) < 0.57 pn-3}, 
where du g’ denotes for each 0 < £' < 1 a subgrid of Zt. of spacing 4rg/,..4 such 


that 0 € Z, p. Fixing £ € (0, 2œ) and rj € (0, 1) to be chosen later, let a = 2o +£, 
a’ = (1 4- 25)?a and 


W* = |{y € DG. rgn-e) : Tk, O) > £5, 4(a)]]. 


Fixing x € Zt. let C := (W* < K7 (ieman and for any z € Z define the 


events Az := {RZ (a) > att}, Bz := {N} e, 4(a) S nga-4(a^)) and C; = {Wi < 




Roe), Observe that A; N B; implies that W* > W* and hence A, N B; N 
C € C; for any z € Z}, g. Further, setting à = 2o — & and considering the events 
FE, :— (7x, (x) > RZ(@)} and H, :— {RZ (a) = att}, we have that if x € Lg, (a), 
then H, UF, holds for each z € Z* n,p Note that by the preceding A,NF,NCC 

(F; N B?) UC; for each z € 2 g and hence 


(x € £x, (2), C] cL JR; JAzU (Na. NFN o) 
£ £ z . 


c | JE; JAzu (m c. (JŒ: n B5. 
Z £ Z Z 
With |Z* ,| < K§ for all e > 0 and n sufficiently large, we thus have that 


Ph anre max RZ (à) > at 4 + aen min R (a) x at E 


ze Kn zez? Kn 


+ K22+© max P( max W < gem 
xeZzz zeŽ* B 
+ KITE max P(Tk, œ) > RZ), N£ g,_4(a) > npn—a(a')) 
x EA: JB 
— Pn,0 + Pn,1 + Pn,2 + Pn,3- 
By (7.4) we have that $p_{n,0} \to 0$ as $n \to \infty$. With $a > 2\alpha$, by (3.18), similar to
the derivation of (7.4) we get also that $p_{n,1} \to 0$ as $n \to \infty$.


Turning to deal with the term pn,2, consider the o-algebra $ = (Y, eg x 9*, 


for $7 corresponding to R’ = rgn-4 and R = n5 in Lemma 2.4. Since 
D(z’, rpn—4) € D(z, rn-1)\ DC, rpn—4) for any z, z € Z p it follows that condi- 
tional upon $, the random variables (W^) zer , are independent with W^ measur- 


able on the o-algebra H? (ng,..4(a^)) corresponding to r = rgn—6 in Lemma 2.4. 
With |Z s | > n? for all n sufficiently large, it follows from the latter lemma that 


Pn = Ks max e( I] P(W* < < aa) 


xeZy eg: P 
2 


n 
zéZy 


n— oo 


provided that for some universal constant c > 0 


(8.6) min P(W* > k280-09 > c, 


zeit Kn 




Applying (3.19) for R =rgn—4, r = rgn—s and N = ngs-4(a), we have that for 
a’ = (14- 25)a' B?/2 and n large enough, 

Z / ly 
(8.7) ly P(5, (a )>a moo 0. 
Further, if 5, ,(a^) < ott, then W? > |.£x, (o7) N D(z, rpn—)|- Thus, taking 
n > 0 and £ > 0 small enough for a’ < 8? and 28(1 — a) — 46 < (2B — 2o//), 
we get (8.6) by combining (8.7) and Theorem 1.2. 

It thus remains only to show that p, 3 — O in order to complete the proof of the 
lemma. To this end, let a = (1 + 25)a, recall the set Go(n) [taking n =n in (6.1)] 
and note that 

2a--2€ ~\ TAT ~ 
pua SKa mex o P(7x, (x) > RKE), IN, pn@ — 11 =n) 
+ KRT max P(NZ 4 (4) <1 +n, NZ g(a) > 1+2n) 
zez? i i 


€ Kn 


pee max P(N, pn—4(@) > (1 + 2n)”) 
zezy 


= Dn (B) + Pn,4 Pn,s; 
where 


Nin bn—4(@) = Nén,pn—4(4)/pn—4(@) 
and the bound above (and in particular the last term p, 5) follows from the inclu- 
sion 


{NE o, 4 (2) > ngn—4(a’), NE g(a) < (1+ 2n)} C (NS, 5, 4 (8) > (1 + 20)?), 


which is obtained by unraveling the definitions. 
Since à = 2a — & and y, (f^) = 1, we have by (6.4) that for any 8’ €e [B(1 — 


a), B], 
20—(2a—€) Fy y (y)d-8 
(88 . Pn(B') < sup Re — 0, 
B'e[£ (1—o), b] Iy?—17n T 

for € = e(a,B,n) and & = &(a, B, n) sufficiently small, using the fact that 
(y, B) e Fig (y) is continuous and F; g/(y) > Fip) = 1 for y £1. 

By the strong Markov property of the simple random walk at R2(a) and the 
bound of (6.3) at y = ((1 + 2n)a — (1 + 5)a)/(a — à), we have that 
KTE ^ max — P'(N,(a—à)zy) 


Pn4 € 
zeZe ye D(zra) 


(8.9) - 
2a--3e—(a—a) Fo p(y) 


< Kn Md 0, 


n-—oo 
for £ = (œ, B, n) small enough, since y — oo and (a — à) Fo,g(y) = 24 Fo,p (3Ẹ + 
13-9) — œas | O. 




'We complete the proof of the lemma by showing that p, 5 = Oe). To 
this end, first note that by (2.4), the probability that the number of excursions 
from 0D(z,rgn—s) to 0 D(z, rgn—4) until time T3D(z,rga) exceeds 2nngs..4(a) is 


bounded for large n and all z by (9/ 10)" Bn—4@) — O (e ?*), Hence, using the 
strong Markov property at T3 D(z,rgn) and translation invariance of the simple ran- 


dom walk, it suffices to show that P'(NO. n40) > 129) = O (e7? ), uni- 
formly in x € 0 D(0, rgn). Let P, denote probabilities with respect to the random 
walk in Zi, K, Then, uniformly in x € 0 D(0, rgn), by conditioning on the o-algebra 


69 of excursions from 3D(0, rg.) to 8 D(0, rg, 1) and twice using Lemma 2.4 
[for r =rgn—5, m = ng, (a), first with K = Kpn and then with K = Kgn], we see 
that 


P> (N5, ps4 (2) >1+ 2n) 
= (1 + 00,)) Pj (RB, sa) > 1 +21). 
Then, for & = (1 + 7)a/2, uniformly in x as above P5, R$, (5) > Athn) = 
O (e 7") by (3.19) and P5, (89, (1 + 2n)a) < at) = O(e"”) by (3.18). 
So, the right-hand probability in (8.10) which can be rewritten as P$, (Ron (a) > 
R? _,((1 + 2n)a)) is uniformly in x at most O(e7?*). O 


(8.10) 


PROOF OF LEMMA 8.2. With Zn ,p' as in the proof of Lemma 8.1, let zg (x) 
denote the point in pA ,p' Closest to x, and Zg y = {z € Z, BU :q4N? pin (a) —1| <n}. 
Taking h < 2, to be chosen below, set B; = f (h/2) f for j —0,1,... and let £ 
be the smallest integer so that Bg < B(1 — œ). Let W*(.,-) be as in n (8. 1), but 
with the set L K,(@) of (7.3) instead of Lg, (o). Note that if Ri(a) < at; for 
all z € Z2 Ky? then £x (a) € L K, (@) and W*(.,-) < W (^, -). Also, automatically 


W*(0, Be) < K^. so for all n sufficiently large the event W* > K2£0- 0*5? 
implies that W*(B;..1, Bj) = Kn Pia) for some j —0,...,£ — 1. Thus, we 


bound the event {x € Lg, (a), W^ > Kn RO. ur in the definition of p, by the 


union of the events {R} (à) > att for some z} and (x € L K, (à), W* (Bio Bj) = > 
2B; (1—o)4-48 


Kn } for j —0,...,£— 1. Splitting the latter events according to whether 
z8,(x) € Z;g,,, or not, we get that 
£-1 £—1 
Pn € Pno t `» Pn,j + » Pn (Bj), 
j=0 j=0 


where 


pij KU max P(x € £x, (à), zp, (x) € Zin, 


xeZ 


PN 2B, (1—a)--45 
W*(6,41, Bp x Ke). 




By (7.4) we know that p,,o — 0 and by (8.8) also p, (Bj) > O for j —0,...,£— 1. 
Turning to deal with p,,;, let D,,;(x) denote the annulus D(x, rgjn—3) \ 
D(x, rgj, n3). Since, for any w > 0 and x € Zk,» 


P(x € £x, (à), zg, (x) € Zg;.s, W" (Bja1, Bj) = w) 


«xw! » P(x,y€ ÊK, (a), zg,(x) € 5,5), 
y€Dg j) 


while logrg;,—3/log Kn — Bj and y4(Bj) X J/1— n, which we may assume by 
taking 7 sufficiently small, it follows from (6.5) that for all n large enough 
2a(1-4-Bj)—26 


ni < K max P(x,y e€ Lg (a), zg, (x) € Zg, 
Pn,j € Kn "um (x,y € Lx, (4), zp, (x) € ZB, n) 


x tus OH )-àF, g (/ 17 2-5 
Bü -a)xB'zp 
Then, p,,; — 0 as n — oo for n, & sufficiently small and A < 2 sufficiently close 
to 2 using the fact that (y, h, B^) > Fh,g' (y) is continuous and Fy, g (0-14 B. 
Possibly decreasing £ and & for (8.8) to hold we complete the proof of (8.3). 1 





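9. Upper bounds for pairs of late points. Recall that $F_{h,\beta}(\gamma) = (1-\gamma\beta)^2/(1-\beta) +
h\gamma^2\beta$. We begin by showing that

(9.1) $2 + 2\beta - 2\alpha \inf_{\gamma \in \Gamma_{\alpha,\beta}} F_{2,\beta}(\gamma) = \begin{cases} 2 + 2\beta - 4\alpha/(2-\beta), & \text{if } \beta \le 2(1-\sqrt{\alpha}), \\ 8(1-\sqrt{\alpha}) - 4(1-\sqrt{\alpha})^2/\beta, & \text{if } \beta \ge 2(1-\sqrt{\alpha}), \end{cases}$

where $\Gamma_{\alpha,\beta} = \{\gamma \ge 0 : 2 - 2\beta - 2\alpha F_{0,\beta}(\gamma) \ge 0\}$, thereby establishing the equivalence
of (1.8) and (1.11). Indeed, as noted before, $F_{2,\beta}(\gamma)$ is quadratic, with minimum
value $F_{2,\beta}(\gamma_2) = 2/(2-\beta)$ achieved at $\gamma_2(\beta) = 1/(2-\beta) < 1$. It is easy to
check that $\Gamma_{\alpha,\beta}$ is the interval $[\gamma_-, \gamma_+]$ for

(9.2) $\gamma_\pm = \gamma_\pm(\alpha, \beta) = \beta^{-1}\max(1 \pm \alpha^{-1/2}(1-\beta), 0).$

Since $\gamma_2 < 1 < \gamma_+$ we see that $\gamma_2 \in \Gamma_{\alpha,\beta}$ if and only if $\gamma_- \le \gamma_2$, leading to the
explicit formula

(9.3) $\rho(\alpha, \beta) = 2 + 2\beta - 2\alpha F_{2,\beta}(\max\{\gamma_-, \gamma_2\})$

[where we denote hereafter the left-hand side of (9.1) as $\rho(\alpha, \beta)$]. Combining this
with the fact that $\gamma_-(\alpha, \beta) \ge \gamma_2(\beta)$ is equivalent to $\beta \ge 2(1-\sqrt{\alpha})$, we obtain
the identity (9.1). Clearly, $\beta \mapsto \rho(\alpha, \beta)$ is continuous on $(0, 1)$ and by (9.1) it
is also monotone increasing in $\beta$ [for $\beta \ge 2(1-\sqrt{\alpha})$ by inspection, while for
$\beta \le 2(1-\sqrt{\alpha})$ we have that $d\rho/d\beta \ge 1$].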
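The equivalence of the variational formula, the explicit formula (9.3) and the two-case closed form in (9.1) can be sanity-checked numerically. The sketch below assumes $F_{h,\beta}(\gamma) = (1-\gamma\beta)^2/(1-\beta) + h\gamma^2\beta$ and the endpoints $\gamma_\pm$ of (9.2); the function names, the grid search and its resolution are illustrative choices and not part of the paper.

```python
import math


def F(h, beta, gamma):
    """F_{h,beta}(gamma) = (1 - gamma*beta)^2 / (1 - beta) + h * gamma^2 * beta."""
    return (1 - gamma * beta) ** 2 / (1 - beta) + h * gamma ** 2 * beta


def gamma_pm(alpha, beta):
    """Endpoints (gamma_-, gamma_+) of the interval Gamma_{alpha,beta}, as in (9.2)."""
    s = (1 - beta) / math.sqrt(alpha)
    return max(1 - s, 0.0) / beta, max(1 + s, 0.0) / beta


def rho_via_93(alpha, beta):
    """rho(alpha, beta) from the explicit formula (9.3)."""
    g_minus, _ = gamma_pm(alpha, beta)
    g2 = 1 / (2 - beta)
    return 2 + 2 * beta - 2 * alpha * F(2, beta, max(g_minus, g2))


def rho_closed(alpha, beta):
    """rho(alpha, beta) from the two-case right-hand side of (9.1)."""
    s = 1 - math.sqrt(alpha)
    if beta <= 2 * s:
        return 2 + 2 * beta - 4 * alpha / (2 - beta)
    return 8 * s - 4 * s * s / beta


def rho_by_grid(alpha, beta, steps=5000):
    """rho(alpha, beta) from a brute-force infimum of F_2 over Gamma_{alpha,beta}."""
    g_minus, g_plus = gamma_pm(alpha, beta)
    inf_F = min(
        F(2, beta, g_minus + (g_plus - g_minus) * i / steps)
        for i in range(steps + 1)
    )
    return 2 + 2 * beta - 2 * alpha * inf_F


def max_discrepancy():
    """Largest gaps between the three expressions on a 9 x 9 grid of (alpha, beta)."""
    worst_93, worst_grid = 0.0, 0.0
    for i in range(1, 10):
        for j in range(1, 10):
            a, b = i / 10, j / 10
            worst_93 = max(worst_93, abs(rho_via_93(a, b) - rho_closed(a, b)))
            worst_grid = max(worst_grid, abs(rho_by_grid(a, b) - rho_closed(a, b)))
    return worst_93, worst_grid
```

On the branch $\beta \ge 2(1-\sqrt{\alpha})$ the agreement reflects the identity $4\beta - 4(\beta - s)^2/\beta = 8s - 4s^2/\beta$ with $s = 1-\sqrt{\alpha}$, obtained by substituting $1 - \gamma_-\beta = (1-\beta)/\sqrt{\alpha}$ into $F_{2,\beta}(\gamma_-)$.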




We prove in this section that for any 0 < a, B, 6 < 1, 
(94) im P(l((, y):x, y € £x (0), d(x, y) < KP) z KOOP) =O. 
~> OO 


To this end, let 


(9.5) Wo, Bo, Bin = {(x, y) S X.y € Lk, (a), l'füon—3 < d(x, y) < rBin=3}. 


It suffices (as usual) to prove that (9.4) holds for K, = n* (nD?, uniformly in y € J. 
Further, logrgn—3/ log Kn — £, so fixing 0 < a, B, ó < 1, itis enough to show that 


(9.6) Jim P(IVoos.l 2 KP OB)+49) — o, 


Note that |Wo,o,5ü—-oya4l < KP (ee) Pe (a)| for some universal no = 
no(a, B) < oo and all n > no, while 


p(a, B) z 2+2B — 2a Fz g(1) — 2(1 — a) - 28(1 — aj), 
so that it follows from (1.2) that 


9.7 d a — p (a, B)+46 
( ) : oo (| ,0, 80 o),nl > Ki ) 
š F P P 2(1—a)-4-46 
< mu ( Kn (œ)| > K; ) 0. 


The following lemma will be proven below. 


LEMMA 9.1. We can choose h < 2 sufficiently close to 2 and à < 2a suffi- 
ciently close to 2a such that for any B! € [B(1 — a), B] 


(9.8) ds, p! = P(IVz n, p | = KR? 099) 752,0. 


tS 


äh Bn (x, y):x, y € LK, (B), Tghnj2—3 < d(x, y) < rgn-a]. 


Fix $h < 2$, $\tilde a < 2\alpha$ according to Lemma 9.1. We then set $\beta_j = \beta(h/2)^j$ and $\ell$
as the smallest integer such that $\beta_\ell < \beta(1-\alpha)$. By Lemma 9.1 we have that
$q_{n,\beta_j} \to 0$, as $n \to \infty$ for $j = 0, \ldots, \ell - 1$. Combining this with (7.5), the
monotonicity of $\beta \mapsto \rho(\alpha, \beta)$ and (9.7), we establish (9.6).


PROOF OF LEMMA 9.1. Let $D_{n,\beta'}(x)$ denote the annulus $D(x, r_{\beta' n-3}) \setminus
D(x, r_{\beta' hn/2-3})$. Fix $0 < \eta < 1$ to be chosen below, abbreviating $\gamma_- = \gamma_-(\alpha, \beta')$,
$\gamma_* = (1-\eta)\gamma_-(\tilde a/2, \beta')$ and $\gamma_h = \gamma_h(\beta')$. We will argue separately depending on
whether or not $\gamma_* \le \gamma_h$. Consider first the case where $\gamma_* \le \gamma_h$. Applying (6.5) at
$\gamma = \gamma_h$ we conclude that for all $n$ large enough,


FEN —áüF +3 
max P(x, y€x,(à)) < Ke hp dtS 


max 
xeZ yY€D, pr (x 


By (9.3) at £', this implies that if y < yp, then 


RS n ~ E gg (,à,h) —6 
qup € K, P2079 O Y Phx, yer, (à)) < Kn? ; 
xeZi yED,, gr (X) 


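where $g_{\beta'}(\eta, \tilde a, h) = 2\alpha F_{2,\beta'}(\max\{\gamma_-, \gamma_2\}) - \tilde a F_{h,\beta'}(\max\{\gamma_*, \gamma_h\})$. (Here
$\max\{\gamma_*, \gamma_h\} = \gamma_h$.) Note that $g_{\beta'}(0, 2\alpha, 2) = 0$ for all $\beta'$; hence for any $\delta > 0$
we can and shall take $h$ sufficiently close to 2, $\tilde a < 2\alpha$ sufficiently close to $2\alpha$
and $\eta > 0$ sufficiently small so that $g_{\beta'}(\eta, \tilde a, h) \le \delta/2$ for all $\beta' \in [\beta(1-\alpha), \beta]$.
Clearly, this choice of parameters guarantees that $q_{n,\beta'} \to 0$ whenever $\gamma_* \le \gamma_h$.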

Keeping this choice of ^, à and 7, we turn to deal with the case where y, > Yh, 
denoting by zs g’ the subgrid in Lig, of spacing 4rgr,..4. Let zg (x) denote the 
point closest to x in Zin. p' SO 


Ney (x) 


PM © S72) s (. min (5; p, @) 3y2) «s. 


Then, using again (9.3) at £’ and the bound (6.5), now for y = y, > yh, we get 
that 


qn B E nB + K, 9900735 Y^. Y. P(x, ye £g, (à), N, 
xeZy yeD, pl) 


ve (a) > > y2) 


2a F> g:(max{y_, y2})—28 
<B) HK ^ 


x max max P(x, ye LK, (a), Nz n, pn = = vz) 
te. x, ye D(z,rgis 2) 
d(x.y)2T Blan 2-3 


gg (nd, h)— 3 
< qu (B^) + Kn” 
(Here max{yx, Yh} = Yx.) AS we have seen, our choice of parameters guarantees 
that 25/(, à, h) < 6/2. Moreover, since y. < 1 < yo [for any f' e (0, 1)], it fol- 
lows by (6.3) that for any £ > 0 and all n large enough, 

—ã Fo gı (Y+ E 2—2p'—àaF, at (y) +2E 
(9.9) alB) < |n gl Kn ^^ < Ka a | 
Note that for $h < 2$ we have $\gamma_h > 1/2$. Hence, using our assumption that $\gamma_* > \gamma_h$
and the definition of $\gamma_*$, we have that $\gamma_-(\tilde a/2, \beta') > 1/2 > 0$. This guarantees that
$\gamma_-(\tilde a/2, \beta')$ is the lower boundary of $\{\gamma : 2 - 2\beta' - \tilde a F_{0,\beta'}(\gamma) \ge 0\}$. It follows
that $2 - 2\beta' - \tilde a F_{0,\beta'}(\gamma_*) < 0$ uniformly in $\beta' \in [\beta(1-\alpha), \beta]$ for which $\gamma_* > \gamma_h$.
Hence we can find $\varepsilon > 0$ so that $\tilde q_n(\beta') \to 0$ as $n \to \infty$ uniformly in this set of values of $\beta'$,
implying in turn that $q_{n,\beta'} \to 0$ as $n \to \infty$. This completes the proof of Lemma 9.1. □




10. Lower bounds for pairs of late points. Fix $0 < \alpha, \beta < 1$. Recall the notation
$K_n = n^{\bar\gamma}(n!)^3$ and the sets $\Psi_{\alpha,0,\beta,n}$ of (9.5). We show that if $\gamma_-(\alpha, \beta) < \gamma < 1$
and $1 - \alpha > \delta > \xi > 0$ are such that $2 - 2\beta - (2\alpha + \xi)F_{0,\beta}(\gamma) > 2\delta$, then


(10.1) Jim P(Wa0,6n1>Kn TOOR =, 

uniformly in y € 4. In view of (9.3), taking €, ô | 0 followed by y € (y... (o, B), 1) 

that converges to max(y.. (o, B), yo(8)), we get the lower bound in Theorem 1.4 

for the subsequence K,,. By the uniformity in y this bound extends to all integers. 
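Fixing $\gamma$, $\delta$ and $\xi$ as above, set $a = 2\alpha + \xi$, recall the notation $r_k$, $n_k(a)$, $R^x_k(a)$
and $N^x_{n,k}(a)$ of Section 4 and let

(10.2) $\tilde n_k = 3a^*\Big(k - \dfrac{\beta - \gamma\beta}{1 - \gamma\beta}\, n\Big)^2 \log k, \qquad \beta n \le k \le n,$

where $a^* = a(1-\gamma\beta)^2/(1-\beta)^2$, so that $\tilde n_n = n_n(a)$ and $\tilde n_{\beta n} = \gamma^2 n_{\beta n}(a)$.

Let $\mathcal{Z}_n \subseteq \mathbb{Z}^2_{K_n}$ be a maximal set of points in $\mathbb{Z}^2_{K_n} \setminus D(0, r_n)$ which are
$4r_{\beta n+4}$ separated, such that $(0, 2r_n) \in \mathcal{Z}_n$. We will say that a point $z \in \mathcal{Z}_n$ is
$(n, \beta)$-qualified if $N^z_{n,k} \overset{k}{\sim} \tilde n_k$ for all $\beta n \le k \le n - 1$ and in addition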


W = |{y e Dle, rpn—4): TK, 0) > RE(a)}| > KEC- 
(compare with the definition of n-successful points in Section 4). If 


min R? (a) ot; 
zEZk, 


then 
(Yao, nl> >, V^ > Hz e Zn: zis (n, Mtr m 


Z€ 
Since P(min zez% R? (a) < att) —> 0 as n — œ (see the term p,, in the proof of 
Lemma 8.1), and (1 — B)a* = aFo,g(y), we thus get (10.1) as soon as we show 
that 
(10.3) |. Jim P(I{z € Zn : z is (n, B)-qualified}| > KUA- — 1, 
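The two endpoint identities for the excursion profile of (10.2), and the relation $(1-\beta)a^* = aF_{0,\beta}(\gamma)$ used above, are elementary but easy to get wrong by a sign. The following is a minimal numerical sketch, assuming the profile $3a^*\big(k - \frac{\beta-\gamma\beta}{1-\gamma\beta}n\big)^2\log k$ with $a^* = a(1-\gamma\beta)^2/(1-\beta)^2$, the counts $n_k(a) = 3ak^2\log k$, and $F_{0,\beta}(\gamma) = (1-\gamma\beta)^2/(1-\beta)$. The function names and parameter values are illustrative only, and $k$ is treated as a real number so that $k = \beta n$ need not be an integer.

```python
import math


def n_level(a, k):
    """n_k(a) = 3 a k^2 log k."""
    return 3 * a * k * k * math.log(k)


def a_star(a, beta, gamma):
    """a^* = a (1 - gamma*beta)^2 / (1 - beta)^2."""
    return a * (1 - gamma * beta) ** 2 / (1 - beta) ** 2


def n_profile(a, beta, gamma, n, k):
    """Profile of (10.2): 3 a^* (k - (beta - gamma*beta)/(1 - gamma*beta) * n)^2 log k."""
    shift = (beta - gamma * beta) / (1 - gamma * beta) * n
    return 3 * a_star(a, beta, gamma) * (k - shift) ** 2 * math.log(k)


def F0(beta, gamma):
    """F_{0,beta}(gamma) = (1 - gamma*beta)^2 / (1 - beta)."""
    return (1 - gamma * beta) ** 2 / (1 - beta)


def endpoint_checks(a, beta, gamma, n):
    """Return the three pairs (lhs, rhs) that should agree."""
    top = (n_profile(a, beta, gamma, n, n), n_level(a, n))
    bottom = (
        n_profile(a, beta, gamma, n, beta * n),
        gamma ** 2 * n_level(a, beta * n),
    )
    exponent = ((1 - beta) * a_star(a, beta, gamma), a * F0(beta, gamma))
    return top, bottom, exponent
```

At $k = n$ the shifted argument equals $n(1-\beta)/(1-\gamma\beta)$, and at $k = \beta n$ it equals $n\gamma\beta(1-\beta)/(1-\gamma\beta)$, which is where the two identities come from.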


The following analogue of Lemma 4.2, whose proof is deferred to the end of 
this section, is the key to the proof of (10.3). 


LEMMA 10.1. For any x,y € Zn, let I(x, y) = max(k:D(x,ry. + IN 
D(y, ry +1) = gj ^ n [note that l(x, y) > Bn +4]. Then there exist b > 10 and 
Qn = (n/rgn) ? *o(In) such that 


(10.4) P(z is (n, B)-qualified) = (1 + 0(15))dn. 




uniformly in y € 4 and z € Zn. Furthermore, for any € > 0 we can find C = 
C(b, £) < œ such that for all n and any x, y € Zn with L(x, y) <n, 


a* 4-& 
(10.5) P(x, y are both (n, B)-qualified) < geren) , 


FI(x,y) 


while for all n and x, y € Zp with I(x, y) =n, 
(10.6) P(x, y are both (n, B)-qualified) < (14-0(1,));2. 


The proof of (10.3) then proceeds exactly as the proof of (4.3), where the con- 
dition 2 — 28 — a Fo,g(y) > 26 implies that a* < 2 and by (10.4) the expected 


number of (n, B)-qualified points is KEAC) so. with 
V = 2: P(x, y are both (n, 8)-qualified), £ — Bn r4, ...,n, 
Xx, VEL, I(x, y)— 
it suffices by (10.6) to show that 


n-—1 
(10.7) >) Veso092,l3,. 
£—n-r4 


With C, denoting generic finite constants that are independent of n, for any 
£ € [Bn + 4, n) and x € Z, there are at most Core. v/ Taped points y € Z4 N 
D(x, 2(rg41 + 1)). Consequently, we have by (10.5) and the definition of I (x, y) 
that for any £ € [Bn + 4, n), 


2 a*4-e 
V; « CalZnl( F4] ) genter ( 2) : 
l'8n44 re 


Similarly to the derivation of (4.9), taking e < 2 — a* and summing over £ results 
with (10.7), hence completing the proof of (10.3). 





PROOF OF LEMMA 10.1. Let R%,, ,, denote the time until completion of 

the first m excursions from 0D(z,rgn—1) to 9 D(z, rgs), and set Az, ES {wz > 
25 mee 

Keo) for WE = {y € D(z, rga a) Tk, (Y) > £5, m)l: Recall that figs = 


n gn (y? ot + £)), so applying (3.19) with R =rgn, r =r gn—1 and N = fg, + Bn, 
we see that for all m < fig, + Bn, 


P(Wz  D(z,rgn-4) N Lr, (œ + £) y? 8?)]) 
> P(A5, m (a - E)y" P K) = 1— o(14). 


Hence, by Theorem 1.2 we have that 


(10.8) P(ÁL)-1-—o(1, uniformly inm ^ fgn. 




Starting at 0 ¢ D(z, rgn) we see that the event AZ, belongs to the o-algebra J€* (m) 
corresponding to r = rgn—2, R =rgn—1 and R' = rg, in Lemma 2.4. Further, if the 
event (N* nn = =m} € Fan occurs, then the law of A5, = = {We > > Kn BQ—ay’)— g 


conditioned upon $* Bn is the same as the law of AZ, conditioned upon 8: Bn" Conse- 
quently, by Lemma 2.4 and (10.8), uniformly in m ^ fgn, 
(10.9) P(A$,185,, NZ gn =m) = PÓZ, Gn Ne gn — m) = 1— O(n). 


With Mj = (I, ..., n — 1), by (10.9) and the fact that {NF , A fix, k € Mpn} € Gan 
we get that 


1 


P(z is (n, B)-qualified) = P(N; , A fi, k € Mgn; Abn) 


> E(N NL, S fp, k € Menai 


Br. 
m^" fin 


N pn =M; P(A bn Gon» Ns gs = m)) 


k ~ 
= (1+ o(1n)) P(N; , ^ fix, k € Men). 
Therefore, taking mn = nn = nn (a), by (5.9) we get (10.4) for 


n~1 
FE mg — 1 
40919 à4- Y. [T] (men +m ) p? (1 po". 
Hi gp H1 £n £ 
Ime —-ng| xe 

It is not hard tó check that our choice (10.2) implies that for some C « oo and all 
k € Men, if |m — fi| < k and |l + 1 — £Zy,1| <k +1, then 

| E 2 C 


1 < —, 
^ klogk 


| ^ k-(B—y8)0 - yB))n 
which by adapting the proof of [3], Lemma 7.2, shows that uniformly in m; A nk 





kila 
and my. ~ "ki, 


C'k-9 3 (miu tm 1 j-3a*-1 
an ee Tema ) meg pye e CK 
( ) logk ~ ( my py CL — py) ^ Viogk 


with 0 « C', C « oo independent of k. Putting (10.10) and (10.11) together we see 
that n = (Tn/ rpn) 4 ten) as Claimed. 

It suffices to prove the upper bounds of (10.5) and (10.6) with the vets 
{z is (n, B)-qualified} replaced by the larger events A(z, n, B) := {N} ; S he; 
k € Mgn}. The proof is a rerun of the argument used in Section 5 io prove 
(4.5) and (4.6), respectively, replacing the events {z is n-successful) by A(z, n, 8), 




taking o = f and fn + 4 instead of p’n, excluding 0 from the sets J; and J; and 
replacing everywhere there q, with n, ng with 7; and a with a*. Indeed, the effect 
of the values 77; is in the application of (10.11) whenever (5.7) is used in Section 5. 

O 


11. Complements and unsolved problems. (A) Let $L^x_n$ denote the number
of times that $x \in \mathbb{Z}^2$ is visited by the simple random walk in $\mathbb{Z}^2$ up to the time
$T_{\partial D(0,n)}$ of exit from the disc of radius $n$. For any $0 < \alpha < 1$, set


Ly 

> 4 
dg =n]: 
Since $\log T_{\partial D(0,n)}/\log n \to 2$ almost surely as $n \to \infty$ (see, e.g., [8], equation (6)),
our result ([3], (1.3)) is equivalent to
log |Y, (œ)! 


n-—-oo logn 





(11.1) V, (œ) = {x € D(0,n): 


(11.2) =2(1— a) a.S. 
Following the line of reasoning of this paper, we expect that for any 0 <a, B < 1, 
choosing Y, uniformly in VV, (o), 


log |W, (o ' D(Y,, n? 
(11.3) nsus SEL A Cm 
n—>00 logn 


= 2B(1 — a) a.S. 


We also expect that the analysis in this paper can be extended to yield 


log |(x, y € Yn (œ) : d(x, y) x nf Y 


logn =p(a,B) as. 


(11.4) m. 

(B) Our study of planar random walk suggests that the analogous results hold
for the planar Wiener sausage. Let $S_\varepsilon(t) = \{x \in \mathbb{T}^2 : \exists s \le t, |W_s - x| \le \varepsilon\}$ denote
the set covered by the Wiener sausage up to time $t$, where $W_t$ is the Brownian
motion on the two-dimensional torus $\mathbb{T}^2$. Consider the uncovered set $U_\varepsilon(\alpha) =
\mathbb{T}^2 \setminus S_\varepsilon(2\alpha(\log\varepsilon)^2/\pi)$ for $0 < \alpha < 1$ (in [4] we show that $U_\varepsilon(\alpha)$ is empty if $\alpha > 1$).
With $\mathrm{Leb}$ denoting Lebesgue's measure, we then expect that

(11.5) $\lim_{\varepsilon \to 0} \dfrac{\log \mathrm{Leb}(U_\varepsilon(\alpha))}{\log \varepsilon} = 2\alpha$ a.s.


and for any x € T?, 1 > B > Ja, 


Leb(U;(a) N D(x, 61 F 
ie gu Oe Opi oaey 
e—0 log € 
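We also expect that for $0 < \alpha, \beta < 1$ and $Y_\varepsilon$ chosen according to Lebesgue measure
on $U_\varepsilon(\alpha)$,

(11.7) $\lim_{\varepsilon \to 0} \dfrac{\log \mathrm{Leb}(U_\varepsilon(\alpha) \cap D(Y_\varepsilon, \varepsilon^{1-\beta}))}{\log \varepsilon} = 2 - 2\beta(1-\alpha)$ a.s.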




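and that

(11.8) $\lim_{\varepsilon \to 0} \dfrac{\log \int_{U_\varepsilon(\alpha)} \mathrm{Leb}(U_\varepsilon(\alpha) \cap D(x, \varepsilon^{1-\beta}))\,dx}{\log \varepsilon} = 4 - \rho(\alpha, \beta)$ a.s.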


We believe that these results can be derived by arguments similar to those used 
here, but have not verified it. 


Acknowledgment. We thank Bertrand Duplantier for pointing our attention to 
the work of Brummelhuis and Hilhorst [1] in connection with our earlier work [4] 
on cover times. 


REFERENCES 


[1] BRUMMELHUIS, M. and HILHORST, H. (1991). Covering of a finite lattice by a random walk.
Phys. A 176 387-408.

[2] CHUNG, K. L. (1974). A Course in Probability Theory, 2nd ed. Academic Press, New York.

[3] DEMBO, A., PERES, Y., ROSEN, J. and ZEITOUNI, O. (2001). Thick points for planar Brownian
motion and the Erdős-Taylor conjecture on random walk. Acta Math. 186 239-270.

[4] DEMBO, A., PERES, Y., ROSEN, J. and ZEITOUNI, O. (2004). Cover times for Brownian
motion and random walks in two dimensions. Ann. Math. 160 433-464.

[5] FITZSIMMONS, P. and PITMAN, J. (1999). Kac's moment formula for additive functionals of
a Markov process. Stochastic Process. Appl. 79 117-134.

[6] KAHANE, J.-P. (1985). Some Random Series of Functions, 2nd ed. Cambridge Univ. Press.

[7] LAWLER, G. (1991). Intersections of Random Walks. Birkhäuser, Boston.

[8] LAWLER, G. (1993). On the covering time of a disc by a random walk in two dimensions. In
Seminar in Stochastic Processes 1992 189-208. Birkhäuser, Boston.


A. DEMBO 

DEPARTMENTS OF MATHEMATICS 
AND STATISTICS 

STANFORD UNIVERSITY 

STANFORD, CALIFORNIA 94305 

USA 

E-MAIL: amir@math.stanford.edu


J. ROSEN 

DEPARTMENT OF MATHEMATICS 
CITY UNIVERSITY OF NEW YORK 
COLLEGE OF STATEN ISLAND 
STATEN ISLAND, NEW YORK 10314 
USA 

E-MAIL: jrosen3@earthlink.net


Y. PERES

DEPARTMENTS OF MATHEMATICS 
AND STATISTICS 

UNIVERSITY OF CALIFORNIA 

BERKELEY, CALIFORNIA 94720 

USA 

E-MAIL: peres@stat.berkeley.edu


O. ZEITOUNI 

DEPARTMENTS OF ELECTRICAL ENGINEERING 
AND MATHEMATICS 

TECHNION, HAIFA 32000 

ISRAEL 

AND 

DEPARTMENT OF MATHEMATICS 

UNIVERSITY OF MINNESOTA 

MINNEAPOLIS, MINNESOTA 55455 

USA 

E-MAIL: zeitouni@math.umn.edu


The Annals of Probability

2006, Vol. 34, No. 1, 264-283

DOI: 10.1214/009117905000000440

© Institute of Mathematical Statistics, 2006


GAUSSIAN ESTIMATES FOR SPATIALLY INHOMOGENEOUS
RANDOM WALKS ON $\mathbb{Z}^d$


BY SAMI MUSTAPHA 
Institut Mathématique de Jussieu 


It is shown in this paper that the transition kernel corresponding to a spatially
inhomogeneous random walk on $\mathbb{Z}^d$ admits upper and lower Gaussian
estimates.


1. Introduction. We shall consider in this paper spatially inhomogeneous
random walks $(S_j)_{j \in \mathbb{N}}$ with bounded symmetric increments in $\mathbb{Z}^d$. More precisely
let $\Gamma = -\Gamma \subset \mathbb{Z}^d$ be a symmetric finite subset of $\mathbb{Z}^d$ and let $\pi : \mathbb{Z}^d \times \Gamma \to [0, 1]$
such that

$$\sum_{e \in \Gamma} \pi(x, e) = 1, \qquad \pi(x, e) = \pi(x, -e), \qquad e \in \Gamma,\ x \in \mathbb{Z}^d.$$

Then we let $(S_j)_{j \in \mathbb{N}}$ be the Markov chain defined by

$$P[S_{j+1} = x + e \mid S_j = x] = \pi(x, e), \qquad e \in \Gamma,\ x \in \mathbb{Z}^d,\ j = 0, 1, \ldots.$$

To avoid unnecessary complications we shall assume that $\Gamma$ contains 0 and all unit
vectors in $\mathbb{Z}^d$, that is, all $e$ with $|e| = 1$ where $|\cdot|$ denotes the Euclidean norm.
Furthermore we shall impose the following ellipticity condition:

(1.1) $\pi(x, e) \ge \alpha, \qquad x \in \mathbb{Z}^d,\ e \in \Gamma,$

for some $\alpha > 0$. It must be emphasized that the random walk $(S_j)_{j \in \mathbb{N}}$ is not
necessarily reversible.
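We shall denote by

(1.2) $p_n(x, y) = P_x[S_n = y], \qquad n = 1, 2, \ldots,\ x, y \in \mathbb{Z}^d,$

the transition kernel corresponding to the chain $(S_j)_{j \in \mathbb{N}}$ and by $L$ the corresponding
generator, that is, the difference operator defined by

(1.3) $Lf(x) = \sum_{e \in \Gamma} \pi(x, e)\big(f(x + e) - f(x)\big), \qquad f : \mathbb{Z}^d \to \mathbb{R}.$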
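To make the setting concrete, here is a minimal sketch of one such walk in dimension $d = 2$. The particular transition function `pi` (a checkerboard pattern), the set `GAMMA` and the constant `ALPHA` are hypothetical choices made only for illustration; they are not taken from the paper. The sketch satisfies the normalization, the symmetry $\pi(x, e) = \pi(x, -e)$ and the ellipticity condition (1.1), and it implements the generator (1.3) directly.

```python
import random

# Hypothetical increment set: 0 and the four unit vectors of Z^2.
GAMMA = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]
ALPHA = 0.1  # hypothetical ellipticity constant


def pi(x, e):
    """Illustrative transition probabilities, symmetric in e and depending on x."""
    p = 0.1 if (x[0] + x[1]) % 2 == 0 else 0.3  # checkerboard inhomogeneity
    if e == (0, 0):
        return 0.2
    if e[1] == 0:  # horizontal steps (+-e_1)
        return p
    return 0.4 - p  # vertical steps (+-e_2)


def step(x, rng):
    """One step of the chain: move from x to x + e with probability pi(x, e)."""
    weights = [pi(x, e) for e in GAMMA]
    e = rng.choices(GAMMA, weights=weights, k=1)[0]
    return (x[0] + e[0], x[1] + e[1])


def generator(f, x):
    """L f(x) = sum over e in GAMMA of pi(x, e) * (f(x + e) - f(x)), as in (1.3)."""
    return sum(
        pi(x, e) * (f((x[0] + e[0], x[1] + e[1])) - f(x)) for e in GAMMA
    )
```

Because the increments are symmetric, $Lf = 0$ for every linear $f$, so each coordinate of the walk is a martingale. Nothing in this construction forces reversibility, since the horizontal and vertical jump rates vary from site to site.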


Received June 2003; revised November 2004. 

AMS 2000 subject classifications. Primary 60J10, 60J35; secondary 60J45, 31C20.

Key words and phrases. Markov chains, transition kernels, Gaussian estimates, discrete potential 
theory. 






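We shall prove that there exists a unique (up to a multiplicative constant) positive
solution $m(\cdot)$ of the adjoint equation

(1.4) $L^*m(x) = \sum_{e \in \Gamma} \pi(x - e, e)\, m(x - e) - m(x) = 0, \qquad x \in \mathbb{Z}^d,$

globally defined on $\mathbb{Z}^d$ (cf. Section 3.2). We shall denote

$$V(x, r) = \sum_{z \in B_r(x)} m(z), \qquad x \in \mathbb{Z}^d,\ r > 0,$$

where $B_r(x) = \{y \in \mathbb{Z}^d, |y - x| \le r\}$, $x \in \mathbb{Z}^d$, $r > 0$.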
THEOREM 1. Let $(S_j)_{j \in \mathbb{N}}$ be as above. Let $p_n(x, y)$, $x, y \in \mathbb{Z}^d$, $n = 1, 2, \ldots,$
denote the corresponding transition kernel. Then there exists $C > 0$, depending
only on $d$, $\Gamma$ and $\alpha$, such that


> Eri) 

A 5) PG) 3 Tay (y, faye ( as 
nzlxyez, 
(3) E AE ey eca 

(1.6) SHIT E CV, MV O, Jm) " i 


nzlixyeZx-y|xn/C. 


The following comments may be helpful in placing the above theorem in its 
proper perspective. 


(i) It will be shown in Section 3 that the volume function $V(x,r)$ satisfies the
doubling property

(1.7)  $$V(x,2r) \le C\,V(x,r), \qquad x\in\mathbb{Z}^d,\ r > 0.$$

The volume factor $(V(x,\sqrt{n})V(y,\sqrt{n}))^{1/2}$ in (1.5), (1.6) can therefore be
replaced by $V(x,\sqrt{n})$.

(ii) In the reversible case, Theorem 1 is an immediate consequence of
Delmotte's work (cf. [5]). Delmotte proved the equivalence of the upper and lower
Gaussian estimates to the volume doubling property (1.7) plus the Poincaré
inequality for reversible Markov chains with bounded increments on graphs. His
approach relies on a clever adaptation of the Moser iteration process. The reversibility
property $p(x,y)m(x) = p(y,x)m(y)$ verified by the kernel of the chain $(S_j)_{j\in\mathbb{N}}$
and its invariant measure $m$ plays a crucial role in [5] (cf. also [2]).

(iii) Reversible Markov chains on $\mathbb{Z}^d$ are the discrete analogues of diffusions
generated by second-order differential operators in divergence form and the inho- 
mogeneous walks can be considered as the analogues of second-order operators 


266 S. MUSTAPHA 


in nondivergence form (cf. [12], Table 1, page 78). The first two-sided Gaussian 
bound for fundamental solutions of parabolic equations in divergence form with 
measurable coefficients is due to Aronson (cf. [1]). For operators in nondivergence 
form such upper and lower Gaussian estimates were proved only recently by Es- 
cauriaza in [6]. 

(iv) In Aronson's work the parabolic Harnack inequality is used to obtain the
Gaussian lower bound (cf. [1]). In fact both upper and lower bounds for the heat
kernel can easily be deduced from the parabolic Harnack principle (cf. [19]).
Conversely it is shown in [10] that the two-sided Gaussian bound implies the Harnack
inequality. Saloff-Coste showed in [16] that the parabolic Harnack inequality (or
the two-sided Gaussian bound) for a divergence-form second-order operator (or
for the Laplace-Beltrami operator on a Riemannian manifold) is equivalent to a
family of Poincaré type inequalities for balls and the doubling property (cf. also
[11]). The results of [5] are the discrete counterpart of [16]. It would be interesting to
discuss this type of equivalence for both nondivergence differential operators and
nonreversible random walks.
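The replacement of the volume factor mentioned in comment (i) can be spelled out as follows. This is a sketch of the standard argument, not taken verbatim from the paper: $C_D \ge 1$ denotes the doubling constant in $V(x,2r)\le C_D V(x,r)$, $k$ is the smallest integer with $2^k \ge 1 + |x-y|/r$, and $C'$ is a constant depending only on $C$ and $\gamma$ (all three symbols are introduced here for illustration).

```latex
\begin{align*}
V(y,r) &\le V\bigl(x,\, r+|x-y|\bigr) \le V(x, 2^{k} r) \le C_D^{\,k}\, V(x,r)
        \le C_D \Bigl(1+\frac{|x-y|}{r}\Bigr)^{\gamma} V(x,r),
        \qquad \gamma = \log_2 C_D, \\
\bigl(V(x,r)\,V(y,r)\bigr)^{1/2} &\ge C_D^{-1/2}\Bigl(1+\frac{|x-y|}{r}\Bigr)^{-\gamma/2} V(x,r)
        \qquad \text{(first line with $x$ and $y$ exchanged)}, \\
\Bigl(1+\frac{|x-y|}{\sqrt{n}}\Bigr)^{\gamma/2} \exp\Bigl(-\frac{|x-y|^2}{Cn}\Bigr)
        &\le C' \exp\Bigl(-\frac{|x-y|^2}{2Cn}\Bigr).
\end{align*}
```

Taking $r = \sqrt{n}$, the polynomial factor produced by comparing $V(x,\sqrt{n})$ and $V(y,\sqrt{n})$ is absorbed by the Gaussian factor at the cost of changing the constants, in the upper bound (1.5) as well as in the lower bound (1.6).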


A general outline of the paper is as follows. Section 2 collects the main potential
theoretic properties of spatially inhomogeneous random walks on $\mathbb{Z}^d$. The two
new results of this section (Theorems 4 and 5) are of independent interest and are
proved in Section 5. In Section 3 we define the concept of normalized adjoint solu- 
tion adapted to spatially inhomogeneous random walks and we prove that adjoint 
solutions verify a parabolic Harnack principle. This Harnack principle is used in 
Section 4 to deduce the Gaussian estimates of Theorem 1. 


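2. Potential theory. Let $A\subset\mathbb{Z}^d$ denote a bounded domain (i.e., a finite
connected set of vertices in $\mathbb{Z}^d$). We let

$$\partial A = \{x\in A^c,\ x = z+e, \text{ for some } z\in A \text{ and } e\in\Gamma\},$$

$\Gamma$ being as in the previous section, and

$$\bar A = A\cup\partial A.$$

Let $B = A\times\{a\le k\le b\}\subset\mathbb{Z}^d\times\mathbb{Z}$ where $A\subset\mathbb{Z}^d$ and where $a < b\in\mathbb{Z}$. We let

$$\partial_l B = \bigcup_{a<k<b}\partial A\times\{k\}, \qquad \partial_p B = \partial_l B\cup(\bar A\times\{a\}),$$

and $\bar B = B\cup\partial_p B$. $\partial_p B$ is the parabolic boundary of $B$ and $\partial_l B$ is its lateral
boundary. We say that $u : \bar A\to\mathbb{R}$ is harmonic in $A\subset\mathbb{Z}^d$ if

$$Lu(x) = \sum_{e\in\Gamma}\pi(x,e)\bigl(u(x+e) - u(x)\bigr) = 0, \qquad x\in A.$$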




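Let $B = A\times\{a\le k\le b\}\subset\mathbb{Z}^d\times\mathbb{Z}$ and $u : \bar B\to\mathbb{R}$. We say that $u$ is caloric in $B$
if

$$\mathcal{L}u(x,k) = \sum_{e\in\Gamma}\pi(x,e)\,u(x+e,k) - u(x,k+1) = 0, \qquad (x,k)\in A\times\{a\le k< b\}.$$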


The following maximum principle is immediate. 


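THEOREM 2 (Maximum principle). Let $B = A\times\{a\le k\le b\}\subset\mathbb{Z}^d\times\mathbb{Z}$,
where $a < b\in\mathbb{Z}$ and $A$ is a bounded domain in $\mathbb{Z}^d$ and let $u : \bar B\to\mathbb{R}$ such
that $\mathcal{L}u = 0$ in $B$ and $u\ge 0$ on $\partial_p B$. Then $u\ge 0$ in $B$.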
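The caloric equation and the maximum principle can be checked numerically. The sketch below assumes $d = 1$, $A = \{1,\ldots,L\}$ with $\partial A = \{0, L+1\}$, and the same hypothetical environment `pi` as in the simulation sketch; `solve_caloric`, `bottom` and `lateral` are illustrative names. A caloric function is computed forward in time from data on the parabolic boundary via $u(x,k+1) = \sum_e \pi(x,e)u(x+e,k)$.

```python
import math

ALPHA = 0.1  # illustrative ellipticity constant


def pi(x, e):
    """Hypothetical environment on Z with Gamma = {-1, 0, 1}."""
    u = 0.5 * (1.0 + math.sin(1.7 * x))
    p = ALPHA + 0.5 * (1.0 - 3.0 * ALPHA) * u
    return p if e != 0 else 1.0 - 2.0 * p


def solve_caloric(L, steps, bottom, lateral):
    """Return u[k][x], x = 0..L+1, k = 0..steps, caloric in {1..L} x {0..steps}.

    bottom(x) gives the data on the bottom (k = 0); lateral(x, k) gives the
    data on the lateral boundary x in {0, L+1} for k >= 1.
    """
    u = [[bottom(x) for x in range(L + 2)]]
    for k in range(steps):
        prev = u[-1]
        row = [lateral(0, k + 1)]
        # interior update: u(x, k+1) = sum_e pi(x, e) u(x+e, k)
        row += [sum(pi(x, e) * prev[x + e] for e in (-1, 0, 1)) for x in range(1, L + 1)]
        row.append(lateral(L + 1, k + 1))
        u.append(row)
    return u


L, STEPS = 10, 30
bottom = lambda x: float((7 * x) % 5)   # nonnegative data, values in {0,...,4}
lateral = lambda x, k: float(k % 3)     # nonnegative data, values in {0, 1, 2}
u = solve_caloric(L, STEPS, bottom, lateral)
interior = [u[k][x] for k in range(1, STEPS + 1) for x in range(1, L + 1)]
```

Since each interior value is a convex combination of values one time step earlier, the interior values stay between the minimum and the maximum of the parabolic boundary data, which is the content of Theorem 2 in this toy setting.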


The following theorem (cf. [13]) is a random walk version of a well-known and
fundamental result in the potential theory of second-order equations in nondivergence
form (for an elliptic version cf. [14]).


THEOREM 3 (Parabolic Harnack principle). Let $u$ be a nonnegative caloric
function in $B_{2r}(y)\times\{s-4r^2\le k\le s\}$, $(y,s)\in\mathbb{Z}^d\times\mathbb{Z}$, $r\ge 1$. Then

(2.1)  $$\sup\{u(x,k);\ x\in B_r(y),\ s-3r^2\le k\le s-2r^2\} \le C\inf\{u(x,k);\ x\in B_r(y),\ s-r^2\le k\le s\},$$

where $C = C(d,\alpha,\Gamma) > 0$.


In the proof of Theorem 1, together with the previous results we need the following
estimates which describe the boundary behavior of nonnegative caloric functions.
Let $y_0\in\mathbb{Z}^d$, let $R_0 > 0$ and let $\Omega = B_{R_0}(y_0)$.

Let $Q = \Omega\times\mathbb{Z}$, let $Y = (y,s)\in\partial\Omega\times\mathbb{Z}$ and let $c\le r\le R_0/2$ where $c > 0$
denotes a constant times diam($\Gamma$). We shall denote

$$C_r(Y) = B_r(y)\times\{s-r^2\le k\le s\}, \qquad Q_r(Y) = Q\cap C_r(Y),$$
$$\overline{Y}_r = (y_r,\ s+2[r]^2), \qquad \underline{Y}_r = (y_r,\ s-2[r]^2),$$

where $y_r\in\Omega$ satisfies $\bigl|y_r - (R_0 - r/2)\frac{y}{|y|}\bigr|\le 1$ and $[r]$ denotes the greatest
integer $\le r$.





THEOREM 4 (Boundary Harnack principle). Let $Y = (y,s)\in\partial\Omega\times\mathbb{Z}$. Let $c\le
r\le R_0/K$ where $K > 0$ is large enough. Assume that $u$ and $v$ are two nonnegative
caloric functions in $Q\cap(B_{3Kr}(y)\times\{s-9K^2r^2\le k\le s+9K^2r^2\})$ and $u = 0$ on
$(\partial\Omega\times\mathbb{Z})\cap(B_{2Kr}(y)\times\{s-4K^2r^2\le k\le s+4K^2r^2\})$. Then

(2.2)  $$\sup_{Q_r(Y)}\frac{u}{v} \le C\,\frac{u(\overline{Y}_{Kr})}{v(\underline{Y}_{Kr})},$$

where $C = C(d,\alpha,\Gamma) > 0$.




THEOREM 5 (Backward Harnack principle). Let $u$ be a nonnegative caloric
function in $B_r(y_0)\times\mathbb{N}$ ($y_0\in\mathbb{Z}^d$, $r > 0$ large enough) vanishing on $\partial B_r(y_0)\times\mathbb{N}$.
Then

(2.3)  $$u(x,\ k+2[r]^2) \le C\,u(x,k), \qquad (x,k)\in B_r(y_0)\times\{r^2\le k\le 3r^2\},$$

where $C = C(d,\alpha,\Gamma) > 0$.


The proofs of the estimates (2.2) and (2.3), which follow closely the proofs of
the corresponding facts about the boundary behavior of nonnegative solutions of
second-order equations in nondivergence form (cf. [3, 4, 7-9, 15]), are given in
Section 5.

We shall use throughout the usual convention $f\approx g$ to indicate that
$C^{-1}\le f/g\le C$ for an appropriate constant $C > 0$ and $C$, $c$ are used to denote
different positive constants which depend only on $d$, $\alpha$, diam($\Gamma$).


3. The adjoint Harnack principle. Let $D\subset\mathbb{Z}^d$ denote a bounded domain
and $a < b\in\mathbb{Z}$. We say that $v = v(x,t) : D\times\{a\le t\le b\}\to\mathbb{R}$ is a parabolic
adjoint solution of $L$ in $D\times\{a\le t\le b\}$, if $v$ satisfies the equation

$$v(y,t+1) - v(y,t) = L^*v(y,t), \qquad t = a,\ldots,b-1,\ y\in D,$$

where $L^*$ is defined as in (1.4). Let $m(\cdot)$ be a fixed positive adjoint solution for $L$
in $D$ [i.e., $m : D\to\mathbb{R}$, $m(x) > 0$, $\forall x\in D$, $L^*m = 0$ in $D$]. For instance, if $D\subset
B_{r_0}(0)$ lies in the Euclidean ball centered at 0 and of radius $r_0 > 0$ large enough,
we can set $m(x) = G(x^*, x)$ where $x^*\in B_{4r_0}(0)\setminus B_{3r_0}(0)$, with $G(\cdot,\cdot)$ being the
Green function of $(S_j)_{j\in\mathbb{N}}$ in the ball $B_{5r_0}(0)$. Let $v$ be a parabolic adjoint solution
in $D\times\{a\le t\le b\}$; the function

$$\tilde v(y,t) = \frac{v(y,t)}{m(y)}, \qquad (y,t)\in D\times\{a\le t\le b\},$$

is called a normalized parabolic adjoint solution of $L$ in $D\times\{a\le t\le b\}$ (cf. [4]).


THEOREM 6. Suppose that $\tilde v$ is a nonnegative normalized adjoint solution
for $L$ in $B_r(y_0)\times\mathbb{N}$, where $y_0\in\mathbb{Z}^d$ and $r > 0$ large enough. Then there exists a
constant $C > 0$ depending only on $d$, $\alpha$, $\Gamma$ such that

(3.1)  $$\sup\{\tilde v(y,s);\ y\in B_{r/2}(y_0),\ r^2\le s\le 2r^2\} \le C\inf\{\tilde v(y,s);\ y\in B_{r/2}(y_0),\ 3r^2\le s\le 4r^2\}.$$


PROOF. Let $B_r(y_0) = B_r$ and let $q_t(\cdot,\cdot)$ denote the Green function of $\mathcal{L}$ in $B_r$.
An easy induction on $t$ gives the following representation formula for the parabolic
normalized adjoint solutions:

(3.2)  $$\tilde v(y,t) = \sum_{x\in B_r} m(x)\,\tilde v(x,0)\,\frac{q_t(x,y)}{m(y)} + \sum_{s=0}^{t-1}\sum_{x\in\partial B_r} m(x)\,\tilde v(x,s)\sum_{e\in\Gamma_x}\pi(x,e)\,\frac{q_{t-s-1}(x+e,y)}{m(y)}, \qquad y\in B_{r/2},\ t = 1,2,\ldots,$$

where $\Gamma_x = \{e\in\Gamma / x+e\in B_r\}$. On the other hand, let us observe that if we extend
$q_s(\cdot,y)$ by $q_s(\cdot,y) = 0$, $s < 0$, in an appropriate neighborhood of $\partial B_r$ and use the
boundary Harnack principle (2.2) to compare $u_1(x,t) = q_t(x,y)$ and $u_2(x,t) =
q_{t+10r^2}(x,y)$, combined with the backward Harnack principle (2.3), we deduce
then that


qs(x, y) € Cqy 2x, y), dist(x, 0 Br) Scr, 
(3.3) 
y € B2,0<s< 2r?. 


Let now yi, y2 € B,jy? and r? «t < 2r?, 3r? <h « 4r?, By (3.2), (3.3) and 
Theorem 5 we have 


901.1 D ma) immo 
(3.4) . 
£CY) Y) moia) Y, ra, 9228 09. 
IPEA eel’ m(y1) 
v(y2,12) 2 2. m(x)v(x, oen 
(3.5) 


je y2) 
x ni IT 2- m(x)v(x, s) 5. (x, O ea 


C 5-0 xe08, ecry (Q2) 
A simple use of the boundary Harnack principle [combined with (2.1) and (2.3)] 
allows us to deduce then 
m(yi)  4,2(yo, yo) 
Thus, to prove (3.1) it suffices to show that 


q,2 yo, Y1) 4, dr? (yo. y2) 
m(y1) m(y2) 


The estimate (3.6) is a consequence of (3.4), (3.5) and the fact that 1 is a normal- 
ized parabolic adjoint solution. Indeed, let Y (x) = G(x, yo) be the Green function 


v(ya, t2). 


(3.6) ; yp. y2 € Br. 




of $L$ in $B_r$ with pole at $y_0$. Let $A\in B_r$ fixed with $|A - y_0|\approx r/4$. By the boundary
Harnack principle and the backward Harnack principle we have


X q,2 yo. y) 
qy2(x, y) 2 v(x) ———— V (A) ; 


If we apply now (3.4) and (3.5) to $\tilde v = 1$, we deduce then


myi)<C » m) aiv. Y) C 2; m(x)4,2(yo. y1) 
X€B, — Ba, 74 X€Barj4 


*Cr^ 3, m(x) 31 1G. aio. aub?) I. 
xca B, eel, v(A) 


x € B, \ B3y 14, ye Br. 





I | l 
my)z- > m(x) e 400.2 + a >> m(x)q,2(y0, y2) 


C x€ By — By 4 VA) l xEB3rj4 


j“ V (x +e) 
iP ear uis ,€)q, (yo. Y2) ———— P) 


and these two inequalities imply (3.6). L1] 


3.1. The doubling property for the adjoint solutions. We start with a doubling 
property for the Green functions. 


PROPOSITION 1. Let $R$ be large enough and let $x_0\in\mathbb{Z}^d$. Let $G^R(\cdot,\cdot)$ denote
the Green function of $(S_j)_{j\in\mathbb{N}}$ in the ball $B_{4R}(x_0)$. There exists then a constant
$C > 0$ (independent of $x_0$ and $R$) such that


3.7) X G*G,ypzC Y) Gl*'G,y, x € Bro), 1 <r<R/2. 
ye Bo, (xg) y€ B, (xo) 


This proposition is a consequence of the parabolic Harnack principle and the 
following lemma. 


LEMMA 1. Let $R$ be large enough and let $x_0\in\mathbb{Z}^d$. Let $h_t^R(x,y)$,
$t = 0,1,\ldots$, $x,y\in B_{4R}(x_0)$, denote the heat kernel of $(S_j)_{j\in\mathbb{N}}$ in the ball $B_{4R}(x_0)$.
Then there exists $c > 0$ (independent of $x_0$ and $R$) such that
(3.8) inf JO mAh yze  1<s<r?,1<r z< R/2. 


ze B, (x 
0) ve B. (xo) 


Indeed, by the parabolic Harnack principle we have 


C inf 5, hyiy)z sup  », hy) 
2€ Bo, (xo) ye B, (xo) z€ Bor (xo) y€ B, (xo) 


> inf SY Agy) zc. 


EBra yep, (xo) 




We have then 
Y. Awe Y O hoa.) 
y€ B, (xg) y€B, (xo) u€ Bo, (xo) 
> b» ini ( > Hate JF) 
uc Bo, (xo) “P7 C Veg (xo) 
X hfG.w) 
u€ Bo, (xo) 


and this implies the doubling property for the heat kernel with the time shifted
$t\to t+2r^2$. If we sum on $t$ we deduce that


D Peys } X hy) 


y€Bo, (xo) y€Br(xo) r—2r? 
«C > G*(xy) 
y€B, (xo) 
The adjoint Harnack principle (3.1) and Proposition 1 give 


THEOREM 7. Let $L^*m = 0$, $m > 0$, in $B_{8r}(z)$, $z\in\mathbb{Z}^d$ and $r > 0$ large enough.
Then

(3.9)  $$\sum_{y\in B_{2r}(z)} m(y) \le C\sum_{y\in B_r(z)} m(y),$$

where $C = C(d,\alpha,\Gamma)$.


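Indeed let $x^*\in B_{6r}(z)\setminus B_{5r}(z)$. Let $G(\cdot,\cdot)$ denote the Green function of $(S_j)_{j\in\mathbb{N}}$
in the ball $B_{7r}(z)$. We have

$$\sum_{y\in B_{2r}(z)} m(y) \le C\sup_{y\in B_{2r}(z)}\frac{m(y)}{G(x^*,y)}\sum_{y'\in B_{2r}(z)} G(x^*,y')$$
$$\le C\inf_{y\in B_{2r}(z)}\frac{m(y)}{G(x^*,y)}\sum_{y'\in B_{2r}(z)} G(x^*,y')$$
$$\le C\inf_{y\in B_{2r}(z)}\frac{m(y)}{G(x^*,y)}\sum_{y'\in B_r(z)} G(x^*,y') \le C\sum_{y'\in B_r(z)} m(y').$$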


The second inequality follows from the adjoint Harnack principle and the third one 
from the doubling property (3.7). 
PROOF OF LEMMA 1. Let

$$U(x,s) = \sum_{y\in B_{2r}(x_0)} h_s^R(x,y), \qquad x\in B_{4R}(x_0),\ s\ge 0.$$

Let $V(x,s)$ be the caloric function defined by

$$V(x,s+1) = \sum_{e\in\Gamma}\pi(x,e)\,V(x+e,s) \quad\text{in } B_{2r}(x_0)\times\{-4r^2\le k\le 4r^2\},$$
$$V(x,s) = 1 \quad\text{on } \partial_p\bigl(B_{2r}(x_0)\times\{-4r^2\le k\le 4r^2\}\bigr)\cap\{s\le 0\},$$
$$V(x,s) = 0 \quad\text{on } \partial_p\bigl(B_{2r}(x_0)\times\{-4r^2\le k\le 4r^2\}\bigr)\cap\{s\ge 1\}.$$

By the maximum principle we get

$$U(x,s)\ge V(x,s), \qquad s\ge 0,\ x\in B_{2r}(x_0).$$

Using the parabolic Harnack inequality applied to $V(x,s)$ in $B_{2r}(x_0)\times\{-4r^2\le
k\le 4r^2\}$ we deduce that

$$\inf_{z\in B_r(x_0)} U(z,s) \ge c\,V(x_0,-r^2) = c, \qquad 0\le s\le r^2.$$

[Note that $V = 1$ on $B_{2r}(x_0)\times\{-4r^2\le k\le 0\}$.]  □
3.2. The global adjoint solution. 


THEOREM 8. There exists a positive adjoint solution $M$ defined globally
in $\mathbb{Z}^d$. This solution is unique up to a multiplicative constant and verifies

(3.10)  $$\sum_{y\in B_{2r}(x)} M(y) \le C\sum_{y\in B_r(x)} M(y), \qquad x\in\mathbb{Z}^d,\ r > 0,$$

where $C = C(d,\alpha,\Gamma)$.


PROOF. Let

$$m_l(y) = a_l\bigl[G_{l+1}(0,y) - G_l(0,y)\bigr], \qquad y\in B_l(0),\ l = 1,2,\ldots,$$

where $G_l(0,\cdot)$ is the Green function of $(S_j)_{j\in\mathbb{N}}$ in the ball $B_l(0)$, $l = 1,2,\ldots$,
with pole at the origin and where the $a_l$ are chosen so that

(3.11)  $$m_l(0) = 1, \qquad l = 1,2,\ldots.$$

It is easy to see that the ellipticity condition (1.1) implies a local Harnack principle
for the nonnegative adjoint solutions. This local Harnack principle and the
normalization condition (3.11) imply that the $m_l$ verify

$$m_l(y)\le C, \qquad y\in B_k(0),\ l\ge k,$$

with a constant $C = C(k)$ depending only on $k$. The diagonal process allows us
then to deduce the existence of a global positive adjoint solution $M$ defined on $\mathbb{Z}^d$.
The fact that this global adjoint solution is unique (up to a multiplicative constant)
follows from the normalized adjoint Harnack principle (3.1) applied to $M_1/M_2$
where $M_1$ and $M_2$ denote two global positive adjoint solutions. Indeed it is always
possible to suppose that $\inf_{\mathbb{Z}^d} M_1/M_2 = 0$ and associate to $\varepsilon > 0$, $z_\varepsilon\in\mathbb{Z}^d$, such
that $M_1(z_\varepsilon)/M_2(z_\varepsilon)\le\varepsilon$. By (3.1) $\sup_{B_R(z_\varepsilon)} M_1/M_2\le C\varepsilon$ with a constant $C > 0$
independent of $R$ and it suffices to let $R\to\infty$ and $\varepsilon\to 0$ to deduce that this
function is constant. Finally, the doubling property (3.10) follows from Theorem 7.
This completes the proof of Theorem 8.  □


4. The Gaussian estimates. The first step in proving the upper Gaussian
estimate (1.5) is to prove the following mass escape estimate for $(S_j)_{j\in\mathbb{N}}$
(cf. [17, 18]).


LEMMA 2. Let $(S_n)_{n\in\mathbb{N}}$ be as in Section 1. Let $p_n(x,y)$, $x,y\in\mathbb{Z}^d$, $n = 1,
2,\ldots$, denote the corresponding transition kernel. Then there exist $C, c > 0$, such
that

(4.1)  $$\sum_{|x-y|\ge R} p_n(x,y) \le C\exp\Bigl(-\frac{cR^2}{n}\Bigr), \qquad x\in\mathbb{Z}^d,\ n,R = 1,2,\ldots.$$

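PROOF. To prove (4.1) it suffices to prove that

(4.2)  $$\sum_{l(x-y)\ge R} p_n(x,y) \le e^{-cR^2/n}$$

for every linear form $l : \mathbb{R}^d\to\mathbb{R}$ such that $l(x-y)\le|x-y|$, $x,y\in\mathbb{R}^d$. Let $s > 0$;
we have

$$\sum_{l(x-y)\ge R} p_n(x,y) \le e^{-sR}\sum_{y\in\mathbb{Z}^d} e^{sl(x)}\,p_n(x,y)\,e^{-sl(y)}$$

from which it follows that

(4.3)  $$\sum_{l(x-y)\ge R} p_n(x,y) \le e^{-sR}\sum_{y_1,\ldots,y_{n-1},y\in\mathbb{Z}^d}\bigl(e^{sl(x)}p_1(x,y_1)e^{-sl(y_1)}\bigr)\times\bigl(e^{sl(y_1)}p_1(y_1,y_2)e^{-sl(y_2)}\bigr)\times\cdots\times\bigl(e^{sl(y_{n-1})}p_1(y_{n-1},y)e^{-sl(y)}\bigr).$$

On the other hand, we have

$$\sum_{y'\in\mathbb{Z}^d} e^{sl(y)}\,p_1(y,y')\,e^{-sl(y')} = \sum_{e\in\Gamma}\pi(y,e)\,e^{-sl(e)} = 1 - s\sum_{e\in\Gamma}\pi(y,e)\,l(e) + O(s^2e^{Cs}) = 1 + O(s^2e^{Cs}),$$

where the last equality follows from the fact that $(S_n)_{n\in\mathbb{N}}$ has symmetric increments.
We deduce then that

(4.4)  $$\sup_{y\in\mathbb{Z}^d}\sum_{y'\in\mathbb{Z}^d} e^{sl(y)}\,p_1(y,y')\,e^{-sl(y')} \le 1 + Cs^2 \le e^{Cs^2}, \qquad |s|\le 1.$$

If $|s|\ge 1$, it suffices to observe that

(4.5)  $$\sum_{y'\in\mathbb{Z}^d} e^{sl(y)}\,p_1(y,y')\,e^{-sl(y')} \le \sum_{e\in\Gamma}\pi(y,e)\,e^{|s||e|} \le e^{Cs^2}.$$

Putting together (4.3), (4.4) and (4.5) we deduce that

$$\sum_{l(x-y)\ge R} p_n(x,y) \le e^{-sR+Cns^2}$$

and optimizing over $s$, we deduce (4.2).  □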
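The optimization over $s$ at the end of the proof can be made explicit. The following sketch assumes that the estimate obtained from (4.3), (4.4) and (4.5) reads $\sum_{l(x-y)\ge R} p_n(x,y)\le\exp(-sR + Cns^2)$ for every $s > 0$, with the same constant $C$ in both regimes $|s|\le 1$ and $|s|\ge 1$; the function $\phi$ and the minimizer $s_*$ are notation introduced here for illustration.

```latex
\begin{align*}
\phi(s) &= -sR + Cns^2, \qquad \phi'(s) = -R + 2Cns = 0 \iff s_* = \frac{R}{2Cn}, \\
\phi(s_*) &= -\frac{R^2}{2Cn} + \frac{R^2}{4Cn} = -\frac{R^2}{4Cn}, \\
\sum_{l(x-y)\ge R} p_n(x,y) &\le e^{\phi(s_*)} = \exp\Bigl(-\frac{R^2}{4Cn}\Bigr).
\end{align*}
```

Under this assumption, (4.2) holds with $c = 1/(4C)$.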


The second step is to apply the parabolic adjoint Harnack principle to

$$\tilde v(y,t) = \frac{p_t(x,y)}{M(y)}, \qquad y\in\mathbb{Z}^d,\ t = 1,2,\ldots.$$

This gives

$$\tilde v(y,n) \le C\inf_{z\in B_{\sqrt{n}}(y)}\tilde v(z,2n), \qquad n\ge C.$$

Hence (with the notation of Section 1)

$$\tilde v(y,n)\,V(y,\sqrt{n}) \le C\sum_{z\in B_{\sqrt{n}}(y)}\tilde v(z,2n)\,M(z) \le C\sum_{z\notin B_{c|x-y|}(x)} p_{2n}(x,z), \qquad n\ge C,$$

in the case $\sqrt{n}\le c|x-y|$, with $c > 0$ small enough. This implies, in this case, by
Lemma 2

$$\tilde v(y,n)\,V(y,\sqrt{n}) \le C\exp\Bigl(-\frac{|x-y|^2}{Cn}\Bigr), \qquad n\ge C.$$

In the case $\sqrt{n}\ge c|x-y|$ it suffices to observe that the Gaussian factor in (1.5)
is $\approx 1$. Hence

$$p_n(x,y) \le \frac{C\,M(y)}{V(y,\sqrt{n})}\exp\Bigl(-\frac{|x-y|^2}{Cn}\Bigr), \qquad n\ge C.$$

The doubling property (3.10) allows us to symmetrize the volume factor in this
estimate and obtain

$$p_n(x,y) \le \frac{C\,M(y)}{\bigl(V(x,\sqrt{n})\,V(y,\sqrt{n})\bigr)^{1/2}}\exp\Bigl(-\frac{|x-y|^2}{Cn}\Bigr), \qquad n\ge C.$$


Let us observe that for $1\le n\le C$ (1.5) and (1.6) are immediate consequences
of the local Harnack estimate. This completes the proof of the upper estimate in
Theorem 1. Finally the lower Gaussian estimate (1.6) can be deduced from the
upper Gaussian estimate (1.5) and the parabolic adjoint Harnack principle by a standard
procedure. We first use (1.5) to deduce that for $A > 0$ large enough

(4.6)  $$\sum_{|x-y|\le A\sqrt{n}} p_n(x,y) \ge \frac12.$$

Parabolic adjoint Harnack applied to

$$\tilde v(y,t) = \frac{p_t(x,y)}{M(y)}, \qquad y\in\mathbb{Z}^d,\ t = 1,2,\ldots,$$

estimate (4.6) and the doubling property (3.10) imply therefore that

(4.7)  $$p_n(x,x) \ge \frac{M(x)}{C\,V(x,\sqrt{n})}, \qquad x\in\mathbb{Z}^d,\ n\ge C.$$

The lower off-diagonal estimate (1.6) is easily deduced from (4.7) by applying
successively the parabolic adjoint Harnack inequality. More precisely, let us fix $x$
and $n$ as in (4.7) and let $y\in\mathbb{Z}^d$ such that $|y-x|\le n/C$ with $C > 0$ large enough.
Let $k$ be the smallest integer $\ge |x-y|^2/n$. Put

$$(a_j, t_j) = \bigl(a_j,\ (1+j/k)\,n\bigr), \qquad j = 0,\ldots,k,$$

with

$$a_0 = x, \qquad a_k = y, \qquad |a_{j+1} - a_j| \approx \frac{|x-y|}{k}, \qquad 0\le j\le k-1.$$

Then $(x,n) = (a_0,n)$ and $(y,2n) = (a_k,2n)$. Moreover

$$|a_{j+1} - a_j| \approx \sqrt{n/k}, \qquad 0\le j\le k-1.$$

Hence the parabolic adjoint Harnack inequality yields

$$\frac{p_{2n}(x,y)}{M(y)} \ge c^k\,\frac{p_n(x,x)}{M(x)} \ge \frac{1}{C\,V(x,\sqrt{n})}\exp\Bigl(-\frac{C|x-y|^2}{n}\Bigr), \qquad n\ge C.$$

This completes the proof of Theorem 1.
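The passage from the factor $c^k$ to a Gaussian factor in the last display can be written out as follows. This is a sketch under the assumptions that each application of the parabolic adjoint Harnack inequality costs a fixed factor $c\in(0,1)$, that $k\le |x-y|^2/n + 1$ (which holds for the smallest integer $k\ge |x-y|^2/n$), and that (4.7) is used in the form $p_n(x,x)\ge M(x)/(C\,V(x,\sqrt{n}))$.

```latex
\begin{align*}
c^{k} &= \exp\bigl(-k\log(1/c)\bigr) \ge c\,\exp\Bigl(-\log(1/c)\,\frac{|x-y|^2}{n}\Bigr), \\
\frac{p_{2n}(x,y)}{M(y)} &\ge c^{k}\,\frac{p_n(x,x)}{M(x)}
   \ge \frac{c}{C\,V(x,\sqrt{n})}\exp\Bigl(-\log(1/c)\,\frac{|x-y|^2}{n}\Bigr).
\end{align*}
```

The constant in the exponent is thus governed by the Harnack constant through $\log(1/c)$, and the number of steps $k\approx|x-y|^2/n$ is what produces the Gaussian dependence on $|x-y|$.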










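5. Proofs of results. To the process $(S_j)_{j\in\mathbb{N}}$ we shall associate the corresponding
space-time process

$$\hat S_j = (S_j,\ t-j)\in\mathbb{Z}^d\times\mathbb{Z}, \qquad t\in\mathbb{Z},\ j = 0,1,\ldots.$$

For any cylinder $Q = \Omega\times\{a\le k\le b\}$, $\Omega$ being a bounded domain in $\mathbb{Z}^d$, we shall
denote $\tau_Q$ the first exit time of $\hat S_j$ from $Q$. The caloric measure in $Q$ at $(x_0,t_0)$ is
defined by

$$\omega_Q^{(x_0,t_0)}(E) = P_{(x_0,t_0)}\bigl[\hat S_{\tau_Q}\in E\bigr], \qquad E\subset\partial_p Q.$$

Observe that for each $g : \partial_p(\Omega\times\{a\le k\le b\})\to\mathbb{R}$, the solution of the boundary
value problem

$$u(x,t+1) = \sum_{e\in\Gamma}\pi(x,e)\,u(x+e,t) \quad\text{in } \Omega\times\{a\le k\le b\},$$
$$u(x,t) = g(x,t) \quad\text{on } \partial_p(\Omega\times\{a\le k\le b\}),$$

can be represented by means of $\omega^{X_0} = \omega_Q^{(x_0,t_0)}$, $(x_0,t_0)\in\Omega\times\{a\le k\le b\}$ as
follows:

$$u(x_0,t_0) = E_{(x_0,t_0)}\bigl[g(\hat S_{\tau_Q})\bigr] = \sum_{(y,s)\in\partial_p Q} g(y,s)\,\omega^{X_0}(y,s).$$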


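5.1. A lower estimate for the caloric measure. Let the notation be as in Section 2
and let $Q = \Omega\times\mathbb{Z}$. For $Y = (y,s)\in\partial_l Q$, $r > 0$ we shall denote

$$\Delta_r(Y) = \partial_l Q\cap C_r(Y),$$

where $\partial_l Q = \partial\Omega\times\mathbb{Z}$.


LEMMA 3. Let $\omega^X = \omega^X_{Q_{2r}(Y)}$. Then

(5.1)  $$\inf_{X\in Q_r(Y)}\omega^X\bigl(\Delta_{2r}(Y)\bigr) \ge \theta, \qquad Y\in\partial_l Q,\ c\le r\le R_0/2,$$

where $\theta = \theta(d,\alpha,\Gamma) > 0$.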


PROOF. Let Y = (y,s) € 9;Q. It is clear that there exists a cylinder 
C' = B r (2) x (s — 4r? < k < s} C Co, (Y) \ (Q x Z) (provided that yz is small 
enough). Let A’ = By, (z) x (s — 4r?) denote the bottom of this cylinder. Using 
the maximum principle we deduce 


(5.2) w* (Ax (Y)) = v(X) = o, gy (A 
in Q5, (Y), and 
v(X) > v(X) = o (A) 




in @’. On the other hand, the parabolic Harnack principle applied to v gives 


inf wX(Ao-(Y))> inf w(X 
xeQ.a ^ (Aor ( 2 aedi y (X) 


(5.3) —cv(z,s— 2r?) 
> cv (z, s —2r^). 
But v’ can be extended from @’ to a large cylinder C” = Byr (z) x [s — 6r?, s] by 
(5.4) v' (X) = oğ (85G" N {t <s — 4r*)) 
so that v' = 1 on C” N {t < s — 4r?) and the lower estimate (5.1) follows then from 


the Harnack principle. L] 


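COROLLARY 1. Under the assumptions of Lemma 3, let $u$ be a nonnegative
solution of $\mathcal{L}u = 0$ in $Q_{2r}(Y)$, which vanishes on $\Delta_{2r}(Y)$. Then $M_r = \sup_{Q_r(Y)} u$
satisfies

(5.5)  $$M_r \le \rho\,M_{2r}, \qquad c\le r\le R_0/4,$$

with a constant $0 < \rho = \rho(d,\alpha,\Gamma) < 1$.


PROOF. Let $X\in Q_r(Y)$; we have

$$u(X) = \sum_{Z\in\partial_p Q_{2r}(Y)} u(Z)\,\omega^X(Z) = \sum_{Z\in\partial_p Q_{2r}(Y)\setminus\Delta_{2r}(Y)} u(Z)\,\omega^X(Z).$$

Hence

$$u(X) \le \omega^X\bigl[\partial_p Q_{2r}(Y)\setminus\Delta_{2r}(Y)\bigr]\,M_{2r} = \bigl(1 - \omega^X[\Delta_{2r}(Y)]\bigr)\,M_{2r} \le (1-\theta)\,M_{2r} = \rho\,M_{2r}.$$  □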
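A standard consequence of (5.5), not stated in the text, is a power-type decay of $u$ near the lateral boundary. The sketch below assumes that $u$ is a nonnegative caloric function in $Q_R(Y)$ vanishing on $\Delta_R(Y)$ with $c\le r\le R\le R_0/2$, so that (5.5) can be applied at the scales $r, 2r, \ldots, 2^{j-1}r$; the integer $j = \lfloor\log_2(R/r)\rfloor$ and the exponent $\gamma$ are notation introduced here.

```latex
\begin{align*}
M_r &\le \rho\,M_{2r} \le \rho^{2} M_{4r} \le \cdots \le \rho^{\,j} M_{2^{j} r} \le \rho^{\,j} M_R,
      \qquad j = \lfloor \log_2 (R/r) \rfloor, \\
\rho^{\,j} &\le \rho^{\,\log_2(R/r) - 1} = \rho^{-1}\Bigl(\frac{r}{R}\Bigr)^{\gamma},
      \qquad \gamma = \log_2(1/\rho) > 0, \\
\sup_{Q_r(Y)} u &\le \rho^{-1}\Bigl(\frac{r}{R}\Bigr)^{\gamma}\sup_{Q_R(Y)} u .
\end{align*}
```

The inequality $M_{2^j r}\le M_R$ uses only that $Q_{2^j r}(Y)\subset Q_R(Y)$ when $2^j r\le R$.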


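5.2. The Carleson principle for caloric functions vanishing on the boundary.
Let the notation be as above. Let $Y = (y,s)\in\partial_l Q$ and $c\le r\le R_0/2$. Assume that
$u$ is a nonnegative caloric function in $Q\cap(B_{3r}(y)\times\{s-9r^2\le k\le s+9r^2\})$ and
$u = 0$ on $\partial_l Q\cap(B_{2r}(y)\times\{s-4r^2\le k\le s+4r^2\})$. Then

(5.6)  $$u(X) \le C\,u(\overline{Y}_r), \qquad X\in Q_r(Y),$$

where $C = C(d,\alpha,\Gamma) > 0$. To prove (5.6) we first observe that the local Harnack
principle allows us to assume that the parabolic distance of $X$ from $\partial_p Q_{2r}(Y)$ is
sufficiently large. We shall denote by $\delta(X)$ [$X\in C_{2r}(Y)$] this distance and suppose
that $\delta(X) = \mathrm{Dist}(X,\partial_p Q_{2r}(Y))\ge C$. Geometric considerations in combination
with the parabolic Harnack principle imply that

(5.7)  $$\delta^\gamma(X)\,u(X) \le C\,r^\gamma\,u(\overline{Y}_r), \qquad X\in Q_{2r}(Y),$$

where $\gamma$ and $C > 0$ are positive constants depending on $d$, $\alpha$, $\Gamma$. Let
$\tilde\delta(X) = \mathrm{Dist}(X,\partial_p C_{2r}(Y))$. Let $0 < \varepsilon_0 < 10^{-2}$ small enough so that

(5.8)  $$\theta_0 = \frac{\rho}{(1-4\varepsilon_0)^\gamma} < 1,$$

where $\rho$ is the constant given by Corollary 1. We shall distinguish two cases. First
assume that $\delta = \delta(X) < \varepsilon_0\tilde\delta(X)$. In this case we have
$\delta = \delta(X) = \delta(x,t) = \mathrm{dist}(x,\partial\Omega) = |x-x_0|$ for some $x_0\in\partial\Omega$. By Corollary 1
applied to $u$ in $Q_{3\delta}(X_0) = C_{3\delta}(x_0,t)\cap Q$, we have

$$u(X) \le \sup_{Q_{(3/2)\delta}(X_0)} u \le \rho\sup_{Q_{3\delta}(X_0)} u.$$

But

$$\tilde\delta(X) \le \tilde\delta(Z) + 4\delta \le \tilde\delta(Z) + 4\varepsilon_0\tilde\delta(X)$$

where $Z = (z,t)\in Q_{3\delta}(X_0)$ is such that $u(Z) = \sup_{Q_{3\delta}(X_0)} u$. Therefore

$$\tilde\delta(X) \le (1-4\varepsilon_0)^{-1}\,\tilde\delta(Z)$$

and this gives

(5.9)  $$\tilde\delta(X)^\gamma u(X) \le (1-4\varepsilon_0)^{-\gamma}\rho\,\tilde\delta(Z)^\gamma u(Z) \le \theta_0\sup_{X\in Q_{2r}(Y)}\tilde\delta(X)^\gamma u(X).$$

It remains to examine the case where $\delta = \delta(X)\ge\varepsilon_0\tilde\delta(X)$. In this case we have

(5.10)  $$\tilde\delta(X)^\gamma u(X) \le \varepsilon_0^{-\gamma}\,\delta(X)^\gamma u(X) \le \varepsilon_0^{-\gamma}\sup_{Q_{2r}(Y)}\delta(X)^\gamma u(X).$$

Putting together (5.9) and (5.10) we deduce that

$$\sup_{Q_{2r}(Y)}\tilde\delta(X)^\gamma u(X) \le \max\Bigl(\theta_0\sup_{Q_{2r}(Y)}\tilde\delta^\gamma u,\ \varepsilon_0^{-\gamma}\sup_{Q_{2r}(Y)}\delta^\gamma u\Bigr).$$

Using (5.8), the fact that $\tilde\delta(X)\approx r$, $X\in Q_r(Y)$ and (5.7), we deduce the
estimate (5.6).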
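5.3. The boundary Harnack principle and proof of Theorem 4. For
$Y = (y,s)\in\partial\Omega\times\mathbb{Z}$, $c\le r\le R\le R_0/2$, we denote

$$\Omega_{R,r}(y) = B_R(y)\cap\{x\in\Omega,\ \mathrm{dist}(x,\partial\Omega) < r\},$$
$$D_{R,r}(Y) = D_{R,r} = \Omega_{R,r}(y)\times\{s-R^2\le k\le s\},$$
$$\Delta_{R,r}(Y) = \Delta_{R,r} = \partial_p D_{R,r}\cap\{x\in\Omega,\ 0 < \mathrm{dist}(x,\partial\Omega) < r\},$$
$$S_{R,r}(Y) = S_{R,r} = \partial_p D_{R,r}\cap\{x\in\Omega,\ \mathrm{dist}(x,\partial\Omega) = r\}.$$


LEMMA 4. Let $Y = (y,s)\in\partial\Omega\times\mathbb{Z}$. Then we have

(5.11)  $$P_X\bigl[\hat S_{\tau_{D_{Kr,r}}}\in S_{Kr,r}\bigr] \ge P_X\bigl[\hat S_{\tau_{D_{Kr,r}}}\in\Delta_{Kr,r}\bigr], \qquad X\in Q_r(Y),$$

provided that $K\ge K_0$ is large enough.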


PROOF OF THEOREM 4. Theorem 4 is an immediate consequence of the Carleson
principle and Lemma 4. Indeed, we may always assume that
v(Y xr) = u(Y x,) = 1. By Carleson, v < co = co(d, a, T) in Qx,(Y) (which 
contains Dgr r). The constant co can be chosen so that, by Harnack, u > 1/co on 
Skr,r. Let ug = cou and vp = es — uo. Let X € Q,(Y). By Lemma 4 we have 


vo(X) € o* (Ag, (Y) < ©” (S, (Y)) < uo(X) 
and then 


sup Te cô sup (2 T 1) < 2c. 


Q.(Y) u Q.(Y) M0 LJ 


PROOF OF LEMMA 4. To prove the estimate (5.11), it suffices to show that 
if u, v: Dgr, — R satisfy 


Lu =Q, u>Q in Drrvi u > Í on SKr,rs 
(5.12) 
£v zz 0, vxi in Drrr; v<0 on OpDkrr \ Akrr, 


then we have 
(5.13) vsu in Q, = Q, (Y) 


provided that K > Ko is large enough. 
The first step is to prove that under (5.12), u verifies the lower estimate 


dist(x, 92) 
614 — agozz( T), x-065e9Q). 
for appropriate constants 6, y > 0. Let $ = (Ro — 5r) 2%. We assume that 


K > 10. We define 4: Qg. — R by dia 
LU -—0 in Qs, 
&-min(u,1  ond,Q6r0Dxrr, 
u=1 on 9p Qe. V Dxz,r- 


Since u > 0, we have, by the maximum principle, 0 < # < 1 in Qer, and since 
ucl on Sgrr we have u > iu on Q,. Let Z € Bo-(¥) satisfying 


IZ — (Ro — 4) g=] x 1. Let Z = (,s — 4[r]?). Let w be defined 


by w(X) = 1—a(X), X € U, U = Qe. N (Ba. (9) x {s —4r? < k € s}). w vanishes 







on QU \ 8j(Bo, (y) x (s - 4r? < k < sj). On the other hand, by the same argument 
as in Lemma 3 we see that wf (aU NV8à(B3, (y) x (s —4r? xk 8s) x c» 0. 
It follows then that w(Z) < 0 supy w, where 0 < 0 < 1. This means that 
I — 4(Z) € 0 « 1 and therefore Z2(Z) > 1 — 0 > 0. Using Harnack and the fact 
that u > ü on Q, we then deduce (5.14). It follows from (5.14) that 
u(X) > 28K 7, X € Qr \ Drr/K, 
where we assume r > Kc and K sufficiently large. Observe that we have in par- 
ticular u(X) > 20K 7", X € Srr/K. 
The second step is to prove that there exists N > 0 such that 
(5.15) v(X) x exp(—N K), X € Dyr. 
Let j = 1,2,... such that 2j + 1 < K and X; such that 
Xj € 9p DQj- prr; sup vs v(X;j) c v(x;j, Sj). 
Daj-i)r,r 
Let Ü = (Ba (xj) x {sj — 8? < k< Sj}) N Dy,,. We have v < 0 
on QpU \ 05 (Bo, (xj) x (sj — 8r? <k< Sj}) and using the fact that 
o (45Ü — 95 (Bo (xj) x (sj - 8r? <k <sj})) zc» 0 
we deduce that 
v(X;) < sup v <0 supv, 
ÜN(B, (xj) x[s;-4r? xkxs;]) Ü 
where 0 « 0 < 1. Hence 
sup uvu<@supv<p sup v, 
DQj-Yyrr Ü Do jer 
where 0 < o < 1. Iterating this estimate we obtain 
l supv < p* sup vxe MF, 
Dr; DOk4 Ayr 
where 2k + 1 < K < 2k + 3. Thus (5.15) is proved. It follows in particular that 
v X«óK Y in D, r provided that K is large enough. 
From the previous considerations it follows that u = Ku > 0in D,,;k and 
uj > lon D,,X D, r/K (that contains $.,/k) and v = K= Qv ~u) < Ky <1 
in D, r (that contains D, ,/k) with v; < 0 on S,,/x. In particular, we have 


KY EET 
Hi — V = Eu —v)z 0, on De, N D, riK-. 


On the other hand, u1, vı satisfy the same assumptions as u, v with r replaced 
by r/K. We can iterate and define uj, vj such that 


KY NJ ee eee ee 
«- v = (=) (u—v)>0 on Dy Ki riki NDrjki ep Kit!» 




j=1,2,..., and consequently 
u—vzO on S(Y) = © D,.iki riki NDyjki kin. 
jz0 
Let now Xo = (xo, € Or C D,,(Y) and let Xo = (Xo,to) where 
Xo € 0&2 satisfies dist(x, ƏN) = |xo — xol. Then Dgr (Xo) C DcK+2)r,r(Y) and 
Skrr (Xo) C S(K+2)r,r(Y). Replacing K with K + 2 in the previous considera- 


tions, we deduce that u > v on S (X o) that contains Xo. This completes the proof 
of Lemma 4. L[] 


5.4. The boundary backward Harnack principle. Let the notation be the same
as in Theorem 5. Let $B_r(y_0) = B_r$. First we observe that the Carleson principle in
combination with the parabolic Harnack principle give

(5.16)  $$u(x,[r^2/8]) \le C\min_{B_{r/2}\times\{r^2/4\le t\le 8r^2\}} u, \qquad x\in B_r\setminus B_{r/2}.$$

On the other hand, by Harnack

(5.17)  $$\max_{B_{r/2}\times\{t=[r^2/8]\}} u \le C\min_{B_{r/2}\times\{r^2/4\le t\le 8r^2\}} u.$$

Since $u = 0$ on $\partial B_r\times\mathbb{N}$, by (5.16), (5.17) and the maximum principle we get

$$\max_{B_r\times\{[r^2/8]\le t\le 8r^2\}} u \le C\min_{B_{r/2}\times\{r^2/4\le t\le 8r^2\}} u.$$

In particular, we have

(5.18)  $$\max_{B_{r/2}\times\{[r^2/8]\le t\le 8r^2\}} u \le C\min_{B_{r/2}\times\{r^2/4\le t\le 8r^2\}} u.$$


Let now $v$ be another nonnegative caloric function in $B_r\times\mathbb{N}$ such that $v$ vanishes
on the lateral boundary $\partial B_r\times\mathbb{N}$. Then, there exists $C > 0$ such that

(5.19)  $$u(x_0,[r]^2)\,v(x,t) \le C\,v(x_0,4[r]^2)\,u(x,t), \qquad x_0\in B_{r/2},\ (x,t)\in B_r\times\{r^2\le t\le 3r^2\}.$$


To prove (5.19) we first use a covering argument to cover 3 (B, x (r^ € t < 3r?]) 
with cylinders C; = Be, (yj) x (sj -&?r? x t € sj], j — 1,..., N, where ¢ > 0 is 
chosen sufficiently small and where YY) = (y;, sj) € 9j(B, x (r? x t < 3r?)), and 
apply in each of these C; the boundary Harnack principle to get 


(5.20) u(Y'R)v(x,t) < Coua), E,D EC, 
where yË ) ] Y? ) r > 0, are defined as in Section 2. By Harnack, we have 
(5.21) v(Y72) < Culo, 4[rP), 


(5.22) u (Y) = cu(xo, [r?/4]). 




Using (5.18) we deduce from (5.22) that 
(5.23) u(Y J > cu(xo, [r Ê). 


C 


Putting together (5.20), (5.21) and (5.23) we deduce that 


uxo, [r])v(x, t) € Cv(xo, 47 )u(x, t), 
(5.24) 
xo € Bjj2, (x, t) €Cj, j — 1,..., N. 


On the other hand, we have, by Harnack, 

(5.05) w(x,t))Cw(xo4[rP), — r^ztzx3r^ dist(x,8B,) > ôr, 
where 0 < ô < 1/2. Again, by Harnack combined with (5.18) 

(5.26) u(x,t) > cuo Ir),  r?<t<3r?, — dist(x,9B,) > dr 


and (5.19) follows from (5.24)-(5.26) with an appropriate choice of $\delta > 0$. We are
now able to get the estimate (2.3). We shall use (5.18), (5.19) and a time-shifting
argument. Let $u$ be as in (2.3) and let $v(x,t) = u(x,t+2[r]^2)$. By (5.19) we have

(5.27)  $$u(x_0,[r]^2)\,u(x,t+2[r]^2) \le C\,u(x_0,6[r]^2)\,u(x,t), \qquad (x,t)\in B_r\times\{r^2\le t\le 3r^2\},$$

with $x_0\in B_{r/2}$ fixed. Equation (2.3) follows from (5.27) and the estimate

$$u(x_0,6[r]^2) \le C\,u(x_0,[r]^2),$$

which is an immediate consequence of (5.18). This completes the proof of Theorem 5.


REFERENCES 


[1] ARONSON, D. G. (1968). Non-negative solutions of linear parabolic equations. Ann. Sci. Norm.
Super. Pisa 22 607-694.

[2] AUSCHER, P. and COULHON, T. (1999). Gaussian lower bounds for random walks from elliptic
regularity. Ann. Inst. H. Poincaré Probab. Statist. 35 605-630.

[3] BASS, R. F. and BURDZY, K. (1994). The boundary Harnack principle for nondivergence form
elliptic operators. J. London Math. Soc. 50 157-169.

[4] BAUMAN, P. E. (1984). Positive solutions of elliptic equations in nondivergence form and their
adjoints. Ark. Mat. 22 536-565.

[5] DELMOTTE, T. (1999). Parabolic Harnack inequality and estimates of Markov chains on
graphs. Rev. Mat. Iberoamericana 15 181-232.

[6] ESCAURIAZA, L. (2000). Bounds for the fundamental solution of elliptic and parabolic
equations in nondivergence form. Comm. Partial Differential Equations 25 821-845.

[7] FABES, E. B., GAROFALO, N. and SALSA, S. (1986). A backward Harnack inequality and
Fatou theorem for nonnegative solutions of parabolic equations. Illinois J. Math. 30
536-565.

[8] FABES, E. B. and SAFONOV, M. V. (1997). Behavior near the boundary of positive solutions
of second order parabolic equations. J. Fourier Anal. Appl. 3 871-882.

[9] FABES, E. B., SAFONOV, M. V. and YUAN, Y. (1999). Behavior near the boundary of positive
solutions of second order parabolic equations. II. Trans. Amer. Math. Soc. 351 4947-4961.

[10] FABES, E. B. and STROOCK, D. W. (1986). A new proof of Moser's parabolic Harnack
inequality via the old idea of Nash. Arch. Rational Mech. Anal. 96 327-338.

[11] GRIGOR'YAN, A. (1991). The heat equation on noncompact Riemannian manifolds. Mat. Sb.
182 55-87. [Translation in Russian Acad. Sci. Sb. Math. 72 (1992) 47-77.]

[12] KOZLOV, S. M. (1985). The method of averaging and random walks in inhomogeneous
environments. Russian Math. Surveys 40 73-145.

[13] KUO, H. J. and TRUDINGER, N. S. (1998). Evolving monotone difference operators on general
space-time meshes. Duke Math. J. 91 587-607.

[14] LAWLER, G. F. (1992). Estimates for differences and Harnack inequality for difference
operators coming from random walks with symmetric, spatially inhomogeneous, increments.
Proc. London Math. Soc. 63 552-568.

[15] SAFONOV, M. V. and YUAN, Y. (1999). Doubling properties for second order operators. Ann.
of Math. (2) 150 313-327.

[16] SALOFF-COSTE, L. (1995). Parabolic Harnack inequality for divergence-form second-order
differential operators. Potential theory and degenerate partial differential operators
(Parma). Potential Anal. 4 429-467.

[17] SALOFF-COSTE, L. and HEBISCH, W. (1993). Gaussian estimates for Markov chains and
random walks on groups. Ann. Probab. 21 673-709.

[18] VAROPOULOS, N. TH. (2000). Potential theory in conical domains. II. Math. Proc. Cambridge
Philos. Soc. 129 301-319.

[19] VAROPOULOS, N. TH., SALOFF-COSTE, L. and COULHON, T. (1992). Analysis and Geometry
on Groups. Cambridge Univ. Press.


INSTITUT MATHÉMATIQUE DE JUSSIEU 
175 RUE DU CHEVALERET 

75013 PARIS 

FRANCE 

E-MAIL: sam@math.jussieu.fr


The Annals of Probability 

2006, Vol. 34, No. 1, 284-320 

DOI: 10.1214/009117905000000431 

© Institute of Mathematical Statistics, 2006 


ON THE STRUCTURE OF SOLUTIONS OF ERGODIC 
TYPE BELLMAN EQUATION RELATED 
TO RISK-SENSITIVE CONTROL 


By HIDEHIRO KAISE AND SHUENN-JYI SHEU¹
Nagoya University and Academia Sinica 


Bellman equations of ergodic type related to risk-sensitive control are 
considered. We treat the case that the nonlinear term is positive quadratic 
form on first-order partial derivatives of solution, which includes linear ex- 
ponential quadratic Gaussian control problem. In this paper we prove that 
the equation in general has multiple solutions. We shall specify the set of 
all the classical solutions and classify the solutions by a global behavior of 
the diffusion process associated with the given solution. The solution associ- 
ated with ergodic diffusion process plays particular role. We shall also prove 
the uniqueness of such solution. Furthermore, the solution which gives us 
ergodicity is stable under perturbation of coefficients. Finally, we have a rep- 
resentation result for the solution corresponding to the ergodic diffusion. 


1. Introduction. We consider the following nonlinear partial differential
equation:

(1.1)  $$\tfrac12 D_i(a^{ij}D_jW) + \tfrac12\hat a^{ij}D_iW D_jW + b\cdot\nabla W + V = \Lambda \quad\text{in } \mathbb{R}^N,$$

or equivalently

(1.2)  $$\tfrac12 a^{ij}D_{ij}W + \tfrac12\hat a^{ij}D_iW D_jW + \tilde b\cdot\nabla W + V = \Lambda, \qquad \tilde b^i(x) = b^i(x) + \tfrac12 D_ja^{ij}(x),$$

where $a(x) = [a^{ij}(x)]$, $\hat a(x) = [\hat a^{ij}(x)]$ are symmetric matrices,
$b(x) = (b^1(x),\ldots,b^N(x))$ is a mapping of $\mathbb{R}^N$ into $\mathbb{R}^N$, and $V(x)$ is a function
on $\mathbb{R}^N$. Here we utilize the notation $D_{ij} = \partial^2/\partial x_i\partial x_j$, $D_i = \partial/\partial x_i$ and the
summation convention for multiple indexes. We think of a pair $(W,\Lambda)$ of function
$W(x)$ and constant $\Lambda$ as a solution of (1.1). Equation (1.1) is called an ergodic
type Bellman equation. Such equations have been treated in ergodic control problems
(cf. [1]). In ergodic control problems, $\hat a$ is negative-definite and more general
forms of (1.1) have been studied under rather general conditions (cf. [2]). On the
other hand, (1.1) also appears in risk-sensitive problems in infinite time horizon


Received July 2003; revised August 2004. 
¹Supported by NSC Grant 92-2115-M-001-035.
AMS 2000 subject classifications. Primary 60G35; secondary 60H30, 93E20. 
Key words and phrases. Ergodic type Bellman equations, risk-sensitive control, classification of
solutions, transience and ergodicity, variational representation.




SOLUTIONS OF BELLMAN EQUATION 285 


and has been studied under certain conditions (cf. [10, 13, 14, 20]). One of the
main features of (1.1) in risk-sensitive control is that $\hat a$ might be indefinite. Indeed,
the following equation is studied in a risk-sensitive control problem (see [10]):

$$\tfrac12 a^{ij} D_{ij} W + \tfrac{\theta}{2} a^{ij} D_i W D_j W + \inf_{z\in Z}\{ f(x,z)\cdot\nabla W + L(x,z)\} = \Lambda, \tag{1.3}$$

where $f:\mathbb{R}^N\times Z\to\mathbb{R}^N$, $L:\mathbb{R}^N\times Z\to\mathbb{R}$, $Z$ is a Borel subset in $\mathbb{R}^M$ and $\theta$ is
a constant in $\mathbb{R}\setminus\{0\}$ which is called a risk-sensitive parameter. Equation (1.3) is
considered to characterize a logarithm-exponential type criterion per unit time on
infinite time:

$$\Lambda = \inf_{z_\cdot}\,\liminf_{T\to\infty} \frac{1}{\theta T}\log E\bigl[e^{\theta\int_0^T L(X_t, z_t)\,dt}\bigr], \tag{1.4}$$

where $\{X_t\}$ is a controlled diffusion process satisfying

$$dX_t = f(X_t, z_t)\,dt + \sigma(X_t)\,dB_t, \qquad X_0 = x\in\mathbb{R}^N, \qquad a^{ij}(x) = (\sigma\sigma^T)^{ij}(x),$$

$\{B_t\}$ is standard Brownian motion and $\{z_t\}$ is a $Z$-valued process which is
considered as a control. The infimum in (1.4) is taken over some class of $\{z_t\}$. In
particular, if we take $f(x,z) = b(x) + C(x)z$, $L(x,z) = V(x) + (1/2)z^T S(x) z$,
$Z = \mathbb{R}^M$, where $C(x)$, $S(x)$ are matrices with suitable dimension and $S(x)$ is
positive-definite, then (1.3) reads

$$\tfrac12 a^{ij} D_{ij} W + \tfrac12 (\theta a - C S^{-1} C^T)^{ij} D_i W D_j W + b\cdot\nabla W + V = \Lambda.$$

Note that the sign of the nonlinear term $\hat a = \theta a - C S^{-1} C^T$ depends on $\theta$. We also
remark that the infimum in (1.3) is attained at $z = -S(x)^{-1} C(x)^T \nabla W(x)$. We are
concerned with the case that $\theta a - C S^{-1} C^T$ is positive-definite since in this paper
we shall study the solutions of (1.1) in the case that $\hat a$ is positive-definite. Recently,
it has also become known that this case happens in some investment problems
in mathematical finance (cf. [3, 8, 11, 12, 21]). However, we remark that, unlike
these papers, the verification theorem will not be considered in this paper. The
verification theorem is to show that $\Lambda$ in (1.4) is equal to $\Lambda^*$ in Theorem 2.6 and

$$z_t = -S(X_t)^{-1} C(X_t)^T \nabla W^*(X_t)$$

is a feedback optimal control, where $W^*$ is a solution corresponding to $\Lambda^*$ [$W^*$ is usually
unique if $W^*(0) = 0$]. See also [15] for some examples from investment problems.
The relation between the drift term $\hat a\nabla W^*$ in (1.8) for $W = W^*$ and $z_t$
as well as its role in the risk-sensitive control problem can be seen from the
arguments in [11, 12]. The main merit of our study is to show that multiple
solutions exist in general for such equations. We also provide particular solution(s)
that is (are) responsible for the verification theorem. We observe that the case when
$\theta a - C S^{-1} C^T$ is negative-definite can also be treated by considering the equation
for $(-W)$. Therefore, according to Theorem 2.6, we have the following interesting
observation. Assume $c_1 \le a(x) \le c_2$ and $c_1 \le C(x) S(x)^{-1} C(x)^T \le c_2$ for some


286 H. KAISE AND S.-J. SHEU 


constants $c_1, c_2 > 0$. Then for small $\theta > 0$, there is $\Lambda^*$ (depending on $\theta$) such that
the above equation has a solution if and only if $\Lambda \le \Lambda^*$. For large $\theta > 0$, there is $\Lambda^*$
(depending on $\theta$) such that the above equation has a solution if and only if $\Lambda \ge \Lambda^*$.
In a risk-sensitive control problem, it is more interesting to assume $V(x)\to\infty$ as
$|x|\to\infty$. In this case, it may happen that $\Lambda^* = \infty$ for large $\theta$. See some discussion
in [20].
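The reduction of (1.3) to the quadratic form in the linear-quadratic case is a pointwise minimization over $z$. The following sketch spells it out; the shorthand $p = \nabla W(x)$ is introduced here for illustration only, and $S(x)$ is assumed symmetric as well as positive-definite.

```latex
% Pointwise minimization in z for f = b + Cz, L = V + (1/2) z^T S z.
% Here p = \nabla W(x) is shorthand used only in this sketch.
\begin{align*}
g(z) &= (b + Cz)\cdot p + V + \tfrac12 z^T S z, \\
\nabla_z g(z) &= C^T p + S z = 0
  \quad\Longrightarrow\quad z^* = -S^{-1} C^T p, \\
g(z^*) &= b\cdot p + V - p^T C S^{-1} C^T p + \tfrac12\, p^T C S^{-1} C^T p \\
       &= b\cdot p + V - \tfrac12\, p^T C S^{-1} C^T p .
\end{align*}
% Adding the term (\theta/2) p^T a p from (1.3) gives the quadratic form
\[
\tfrac{\theta}{2}\, p^T a\, p - \tfrac12\, p^T C S^{-1} C^T p
  = \tfrac12\, p^T\bigl(\theta a - C S^{-1} C^T\bigr) p .
\]
```

Since $g$ is strictly convex in $z$ when $S$ is positive-definite, the critical point $z^*$ is the unique minimizer, which matches the feedback form of the control quoted in the text.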

As we mentioned above, the studies of solutions for Bellman equations
from an analytical point of view are considered to be fundamental to determine an
optimal control. Note that solutions of (1.1) have ambiguity of additive constant,
that is, if $(W, \Lambda)$ is a solution of (1.1), $W(x) + c$ still satisfies (1.1) for each
constant $c$. As some examples show, it is known that (1.1) has multiple solutions even
if we identify the solutions up to additive constants. So, it is important to study
how we pick up a particular solution of (1.1) which gives an optimal control for
the problems at hand. A common way to obtain a particular solution for ergodic
type Bellman equations is to study the discounted type equations. The discounted
type Bellman equation corresponding to (1.1) is as follows:

$$\tfrac12 D_i(a^{ij} D_j W_\alpha) + \tfrac12 \hat a^{ij} D_i W_\alpha D_j W_\alpha + b\cdot\nabla W_\alpha + V = \alpha W_\alpha.$$

Here $\alpha > 0$ is called a discount factor. Under certain conditions, it is shown that
$W_\alpha(x) - W_\alpha(x_0)$ normalized at some point $x_0\in\mathbb{R}^N$ and $\alpha W_\alpha$ converge to some
function $W(x)$ and some constant $\Lambda$, respectively. Furthermore $(W, \Lambda)$ satisfies
(1.1) (cf. [10, 13, 14]). Under the conditions including the linear exponential
quadratic Gaussian (LEQG) control problem, we need to consider the case that
$b(x)$ [resp. $V(x)$] is at most linearly growing (resp. quadratically growing). Under
such settings, $W$ is characterized to meet some growth condition and $(W, \Lambda)$
obtained by this procedure is considered to be the right solution (cf. [13, 14]).

In the present paper we directly tackle (1.1) without the procedure using the
discounted type equation under the conditions including the LEQG case. We shall
specify the set of $\Lambda$ for which (1.1) has a smooth solution. Furthermore we shall
characterize the set of $\Lambda$ by noting the global behavior of the diffusion process which
is related to some control problem.

To explain how we relate (1.1) to a control problem, we shall give a control
interpretation to (1.1). Let $(\Omega, \mathcal{F}, P, \{\mathcal{F}_t\})$ be a probability space with filtration.
Consider the following controlled stochastic differential equation (SDE):

$$dX_t = (\tilde b(X_t) + u_t)\,dt + \sigma(X_t)\,dB_t, \qquad X_0 = x\in\mathbb{R}^N, \qquad \sigma(x) = a(x)^{1/2},$$

where $\{B_t\}$ is an $N$-dimensional $\{\mathcal{F}_t\}$-Brownian motion and $\{u_t\}$ is an
$\{\mathcal{F}_t\}$-progressively measurable process taking its value in $\mathbb{R}^N$. $\{u_t\}$ is considered
as a control process. We define the value function as follows:

$$v(t,x) = \sup_{u_\cdot} E_x\Bigl[\int_0^{T-t}\Bigl(V(X_s) - \tfrac12 \hat a^{-1}_{ij}(X_s)\,u_s^i u_s^j\Bigr)ds\Bigr],$$

where $\hat a^{-1}_{ij}$ is the $(i,j)$-component of the inverse of $\hat a$. By using the Bellman principle,
we see that $v(t,x)$ satisfies the following equation formally:

$$\frac{\partial v}{\partial t} + \tfrac12 a^{ij} D_{ij} v + \sup_{u\in\mathbb{R}^N}\Bigl\{(\tilde b + u)\cdot\nabla_x v - \tfrac12 \hat a^{-1}_{ij} u^i u^j\Bigr\} + V = 0 \quad\text{in } (0,T)\times\mathbb{R}^N, \tag{1.5}$$

$$v(T,x) = 0, \qquad x\in\mathbb{R}^N. \tag{1.6}$$

Since $\sup_{u\in\mathbb{R}^N}\{(\tilde b + u)\cdot\nabla_x v - (1/2)\hat a^{-1}_{ij} u^i u^j\} = (1/2)\hat a^{ij} D_i v D_j v + \tilde b\cdot\nabla_x v$,
(1.5) reduces to the following:

$$\frac{\partial v}{\partial t} + \tfrac12 a^{ij} D_{ij} v + \tfrac12 \hat a^{ij} D_i v D_j v + \tilde b\cdot\nabla_x v + V = 0.$$

Note that the supremum is attained at $u(t,x) = \hat a(x)\nabla_x v(t,x)$. If $-(\partial v/\partial t)(0,x)$
converges to some constant $\Lambda$ and $v(0,x) - v(0,x_0)$ normalized at some point
$x_0\in\mathbb{R}^N$ converges to some function $W(x)$ as $T\to\infty$, we have formally the
following equation which we shall discuss in this paper:

$$\tfrac12 a^{ij} D_{ij} W + \tfrac12 \hat a^{ij} D_i W D_j W + \tilde b\cdot\nabla W + V = \Lambda.$$

This is considered to characterize the long-time average cost defined as follows:

$$\Lambda = \sup_{u_\cdot}\,\limsup_{T\to\infty} \frac1T\, E_x\Bigl[\int_0^T\Bigl(V(X_s) - \tfrac12 \hat a^{-1}_{ij}(X_s)\,u_s^i u_s^j\Bigr)ds\Bigr]. \tag{1.7}$$

Following the Bellman principle, we can expect that $\hat u_t = \hat a(X_t)\nabla W(X_t)$ should
be a candidate of optimal control for (1.7), where $\{X_t\}$ is defined by the controlled
SDE with $u_t = \hat u_t = \hat a(X_t)\nabla W(X_t)$:

$$dX_t = (\tilde b(X_t) + \hat a\nabla W(X_t))\,dt + \sigma(X_t)\,dB_t, \qquad X_0 = x. \tag{1.8}$$

We shall study the structure of solutions of (1.1) by relating to (1.8) under
conditions which include the LEQG case, that is, $b(x)$ [resp. $V(x)$] has at most linear
growth (resp. quadratic growth).
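The passage from (1.5) to the quadratic equation rests on an explicit supremum over $u$. A minimal worked version, by completing the square, is given below; the shorthand $p = \nabla_x v$ is introduced only for this sketch, and $\hat a$ is assumed symmetric positive-definite.

```latex
% Completing the square in u, with p = \nabla_x v (shorthand for this sketch).
\begin{align*}
u\cdot p - \tfrac12\, u^T \hat a^{-1} u
  &= \tfrac12\, p^T \hat a\, p
     - \tfrac12\,(u - \hat a p)^T \hat a^{-1} (u - \hat a p), \\
\sup_{u\in\mathbb{R}^N}\Bigl\{(\tilde b + u)\cdot p - \tfrac12\, u^T \hat a^{-1} u\Bigr\}
  &= \tilde b\cdot p + \tfrac12\, p^T \hat a\, p ,
  \qquad\text{attained at } u = \hat a\, p .
\end{align*}
```

Because $\hat a^{-1}$ is positive-definite, the subtracted square vanishes only at $u = \hat a p$, which is the maximizer $u(t,x) = \hat a(x)\nabla_x v(t,x)$ named in the text.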

The paper is organized as follows. 

In Section 2 we shall specify the set of $\Lambda$ for which (1.1) has a solution under
rather general conditions on $b(x)$ and $V(x)$. Indeed, it is proved that the set of $\Lambda$ is
equal to the closed half-line $[\Lambda^*, \infty)$ for some $\Lambda^*\in(-\infty,\infty)$.

In Section 3 we shall classify $\Lambda$ according to the global property of the
diffusion process defined by (1.8). We shall prove that for $\Lambda > \Lambda^*$, the diffusion
process $\{X_t\}$ in (1.8) corresponding to solution $(W, \Lambda)$ is transient and
for $\Lambda = \Lambda^*$, $\{X_t\}$ is ergodic. Moreover, we shall show that the solution $W(x)$
corresponding to $\Lambda^*$ is unique up to additive constant.

We show the structure of solutions in Sections 2 and 3. In Section 4 we shall
consider the problem of whether the structures specified in Sections 2 and 3 are preserved
under the perturbation on coefficients in (1.1). More precisely, consider (1.1) with
$a = a_n$, $\hat a = \hat a_n$, $b = b_n$, $V = V_n$:

$$\tfrac12 D_i(a_n^{ij} D_j W_n) + \tfrac12 \hat a_n^{ij} D_i W_n D_j W_n + b_n\cdot\nabla W_n + V_n = \Lambda_n.$$

In similar ways to Sections 2 and 3 we can find $[\Lambda_n^*, \infty)$ for (1.1) parameterized
by $n$, and the solution $W_n$ corresponding to $\Lambda_n^*$ is unique. In Section 4 we mainly study
the case that $a_n = a$, $\hat a_n = \hat a$, $b_n = b$, independent of $n$, and shall show that if $V_n$
converges to $V$, $\Lambda_n^*$ converges to $\Lambda^*$ and the unique solution $W_n$ corresponding to $\Lambda_n^*$
converges to the unique solution corresponding to $\Lambda^*$.

In Section 5 we shall study the representation for $\Lambda^*$. To obtain the representation
result, we consider perturbation on $V$ and notice the dependence on $V$ for $\Lambda^*$.
By using the representation, we can prove the moment condition for the invariant
measure of the ergodic diffusion process in (1.8) corresponding to $\Lambda^*$.

Last, we mention the connection to positive solutions of linear equations.
Suppose $a^{ij}(x) = \hat a^{ij}(x)$, $i, j = 1, \ldots, N$. If we take the transformation $\phi(x) = e^{W(x)}$
in (1.1), we have

$$\tfrac12 a^{ij} D_{ij}\phi + \tilde b\cdot\nabla\phi + V\phi = \Lambda\phi. \tag{1.9}$$

Thus, in the case that $a^{ij}(x) = \hat a^{ij}(x)$, the study of solutions for (1.1) reduces to
that of positive solutions for (1.9). We note that the structure of $\mathcal{A}$ specified in
this paper is considered to be a generalization in the theory of positive harmonic
functions for linear differential operators (cf. [22]). Some applications of our results
to the evaluation of large time asymptotics of expectations of diffusion processes
will be given in [17].
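The exponential transformation mentioned above can be verified in two lines. The sketch below works with the non-divergence form (1.2) and assumes $\hat a = a$, as in the text.

```latex
% With \phi = e^{W}: derivatives of \phi in terms of W.
\begin{align*}
D_i\phi &= \phi\, D_i W, \qquad
D_{ij}\phi = \phi\,\bigl(D_{ij} W + D_i W\, D_j W\bigr), \\
\tfrac12 a^{ij} D_{ij}\phi + \tilde b\cdot\nabla\phi + V\phi
  &= \phi\,\Bigl(\tfrac12 a^{ij} D_{ij} W + \tfrac12 a^{ij} D_i W D_j W
       + \tilde b\cdot\nabla W + V\Bigr) = \Lambda\,\phi .
\end{align*}
```

The last equality uses (1.2) with $\hat a = a$; conversely, any positive solution $\phi$ of the linear equation yields a solution $W = \log\phi$ of the nonlinear one.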


2. Set of $\Lambda$ with solutions. In the present section we shall consider the set
of $\Lambda$ for which (1.1) has a classical solution $W$ under rather general conditions.
In the next section we shall classify $\Lambda$ by following the global behavior of the
diffusion process related to the solution $W$ corresponding to $\Lambda$.

We define the following set:

$$\mathcal{A} = \{\Lambda : \text{there exists a smooth function } W \text{ satisfying (1.1) for } \Lambda\}. \tag{2.1}$$

Under the assumptions given below, we can prove that $\mathcal{A}$ has the following form
for some $\Lambda^*\in(-\infty,\infty)$:

$$\mathcal{A} = [\Lambda^*, \infty).$$

For simplicity, we always assume $a^{ij}$, $\hat a^{ij}$, $b$, $V$ are sufficiently smooth. We shall
give the following assumptions:

(A1) There exist $0 < \nu_1 < \nu_2$ such that

$$\nu_1|\xi|^2 \le a^{ij}(x)\xi_i\xi_j \le \nu_2|\xi|^2 \qquad \forall x, \xi\in\mathbb{R}^N.$$

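(A2) There exist $0 < \mu_1 < \mu_2$ such that

$$\mu_1|\xi|^2 \le \hat a^{ij}(x)\xi_i\xi_j \le \mu_2|\xi|^2 \qquad \forall x, \xi\in\mathbb{R}^N.$$

(A3) There exists a smooth function $W_0(x)$ such that

$$\tfrac12 D_i(a^{ij} D_j W_0) + \tfrac12 \hat a^{ij} D_i W_0 D_j W_0 + b\cdot\nabla W_0 + V \to -\infty \quad\text{as } |x|\to\infty.$$

REMARK 2.1. Note that it follows from (A1), (A2) that there exist $c, \bar c > 0$
such that

$$c\,a(x) \le \hat a(x) \le \bar c\,a(x), \qquad x\in\mathbb{R}^N. \tag{2.2}$$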
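One admissible choice of the two constants in Remark 2.1 can be read off directly from (A1) and (A2). The explicit values below are an illustration added here, not constants stated in the original; the lower and upper constants are written $c$ and $\bar c$.

```latex
% For every x and \xi in R^N, combining (A1) and (A2):
\begin{align*}
\hat a^{ij}(x)\xi_i\xi_j &\ge \mu_1|\xi|^2
   \ge \frac{\mu_1}{\nu_2}\, a^{ij}(x)\xi_i\xi_j , \\
\hat a^{ij}(x)\xi_i\xi_j &\le \mu_2|\xi|^2
   \le \frac{\mu_2}{\nu_1}\, a^{ij}(x)\xi_i\xi_j ,
\end{align*}
% so (2.2) holds, for instance, with
\[
c = \frac{\mu_1}{\nu_2}, \qquad \bar c = \frac{\mu_2}{\nu_1}.
\]
```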


REMARK 2.2. In the following, we give some interesting examples for (A3).

(a) Assume $a^{ij}(x)$, $\hat a^{ij}(x)$ are bounded together with their derivatives. Assume
also that there are $c_0, r_0 > 0$ such that

$$b(x)\cdot x \le -c_0|x|^2, \qquad |x| \ge r_0.$$

(b) Assume $a^{ij}(x)$, $\hat a^{ij}(x)$ are bounded together with their derivatives. Assume
also that there are $c_0, r_0 > 0$ such that

$$b(x)\cdot x \ge c_0|x|^2, \qquad |x| \ge r_0.$$

(c) Assume $V(x)\to-\infty$ as $|x|\to\infty$.

For (a), we take $W_0(x) = c|x|^2$ for small $c > 0$. For (b), we take $W_0(x) = -c|x|^2$
for small $c > 0$. For (c), we take $W_0 = 0$.

REMARK 2.3. For the purpose of discussion in the present section, we can
replace (A3) with the existence of a supersolution of (1.1) for some $\Lambda$ to ensure
that $\mathcal{A}\ne\emptyset$. We need (A3) to classify $\mathcal{A}$ in the next section. In the following, we
show that for any $\Lambda$ and $R > 0$, we can construct a subsolution of (1.2) in $B_R$ with
boundary value $W_0$ on $\partial B_R$, where $B_R$ is the open ball with radius $R$ centered at 0.
Indeed, we consider the linear partial differential equation with Dirichlet boundary
condition:

$$\tfrac12 D_i(a^{ij} D_j \overline{W}_0) + b\cdot\nabla\overline{W}_0 + V = \Lambda \quad\text{in } B_R, \qquad \overline{W}_0(x) = W_0(x) \quad\text{on } \partial B_R.$$

Under (A1) and smoothness of coefficients, we have a unique solution
$\overline{W}_0\in C^{2,\gamma}(B_R)\cap C(\overline{B}_R)$. By (A2), we have

$$\tfrac12 D_i(a^{ij} D_j \overline{W}_0) + \tfrac12 \hat a^{ij} D_i\overline{W}_0 D_j\overline{W}_0 + b\cdot\nabla\overline{W}_0 + V \ge \Lambda \quad\text{in } B_R.$$


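In order to see $\mathcal{A}\ne\emptyset$, consider the following Dirichlet problem:

$$\tfrac12 D_i(a^{ij} D_j W_R) + \tfrac12 \hat a^{ij} D_i W_R D_j W_R + b\cdot\nabla W_R + V = \Lambda \quad\text{in } B_R, \tag{2.3}$$

$$W_R = W_0 \quad\text{on } \partial B_R, \tag{2.4}$$

where $B_R$ is the open ball with radius $R$ centered at 0 and $W_0$ is taken from (A3). Note
that (2.3) is equivalent to

$$\tfrac12 a^{ij} D_{ij} W_R + \tfrac12 \hat a^{ij} D_i W_R D_j W_R + \tilde b\cdot\nabla W_R + V = \Lambda \quad\text{in } B_R. \tag{2.5}$$

By (A3), $W_0$ satisfies the following inequality for some $\Lambda$:

$$\tfrac12 D_i(a^{ij} D_j W_0) + \tfrac12 \hat a^{ij} D_i W_0 D_j W_0 + b\cdot\nabla W_0 + V \le \Lambda \quad\text{in } \mathbb{R}^N.$$

Also, from Remark 2.3, we see that for $R' > R$, there exists a smooth
function $\overline{W}_0(x)$ such that

$$\tfrac12 D_i(a^{ij} D_j \overline{W}_0) + \tfrac12 \hat a^{ij} D_i\overline{W}_0 D_j\overline{W}_0 + b\cdot\nabla\overline{W}_0 + V \ge \Lambda \quad\text{in } B_{R'}.$$

Then, under (A1)-(A3), there exists $W_R\in C^{2,\gamma}(\overline{B}_R)$ satisfying (2.3) and (2.4)
(cf. [18], Chapter 4, Theorem 8.4).

We need a uniform bound for $\nabla W_R$ on compact sets to obtain a solution $W$
of (1.1) by sending the radius $R$ to $\infty$. The following gradient estimate is also
useful in the later discussions.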


LEMMA 2.4. Let $W_R$ be a smooth function satisfying (2.3) in $B_R$. Under (A1)
and (A2), we have for each $r > 0$ and $R > 2r$

$$\sup_{B_r}|\nabla W_R|^2 \le C_r + C\Lambda, \tag{2.6}$$

where $C$ is a nonnegative constant independent of $r$ and $R$, and $C_r$ is a constant
depending only on $r$.


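PROOF. Equation (1.1) has a nonlinear term similar to those treated in
[13, 14] and we can follow the same arguments to obtain the gradient estimate.
However, we shall give a proof to specify the dependence on $\Lambda$.

We set $W = W_R$ for simplicity. By differentiating each side of (2.5) with respect to $x_k$, we
have

$$\tfrac12 D_k a^{ij} D_{ij} W + \tfrac12 a^{ij} D_{kij} W + \tfrac12 D_k\hat a^{ij} D_i W D_j W + \hat a^{ij} D_{ki} W D_j W + D_k\tilde b^i D_i W + \tilde b^i D_{ki} W + D_k V = 0. \tag{2.7}$$

Let us set $G = (1/2)\sum_k (D_k W)^2$. Then, using (2.7),

$$\begin{aligned}
&-\tfrac12 a^{ij} D_{ij} G - \hat a^{ij} D_i W D_j G - \tilde b^i D_i G \\
&\quad= -\tfrac12 a^{ij} D_k W D_{kij} W - \tfrac12 a^{ij} D_{ki} W D_{kj} W - \hat a^{ij} D_i W D_k W D_{kj} W - \tilde b^i D_k W D_{ki} W \\
&\quad= \tfrac12 D_k a^{ij} D_k W D_{ij} W + \tfrac12 D_k\hat a^{ij} D_i W D_j W D_k W + D_k\tilde b^i D_i W D_k W + D_k V D_k W - \tfrac12 a^{ij} D_{ki} W D_{kj} W.
\end{aligned} \tag{2.8}$$

We note the second-order derivative terms. Then, we have

$$\begin{aligned}
\text{RHS of (2.8)} &\le \frac{1}{4\delta}\Bigl(\sum_{i,j,k}|D_k a^{ij}|^2\Bigr)|DW|^2 + \frac{\delta}{4}|D^2 W|^2 + \tfrac12 D_k\hat a^{ij} D_i W D_j W D_k W + D_k\tilde b^i D_i W D_k W \\
&\qquad + D_k V D_k W - \tfrac14 a^{ij} D_{ki} W D_{kj} W - \tfrac14 a^{ij} D_{ki} W D_{kj} W \\
&\le \frac{1}{4\delta}\Bigl(\sum_{i,j,k}|D_k a^{ij}|^2\Bigr)|DW|^2 + \tfrac12 D_k\hat a^{ij} D_i W D_j W D_k W + D_k\tilde b^i D_i W D_k W \\
&\qquad + D_k V D_k W - \tfrac14 a^{ij} D_{ki} W D_{kj} W,
\end{aligned}$$

where $\delta > 0$ is a small constant. Indeed, we can take $\delta$ satisfying $\delta \le \nu_1$. From the
matrix inequality $(\operatorname{tr} AB)^2 \le N\nu_2 \operatorname{tr}(AB^2)$, where $A$, $B$ are $N\times N$ symmetric
matrices, $A$ is nonnegative-definite and $\nu_2$ is the maximum eigenvalue of $A$, we finally
obtain the following inequality:

$$-\tfrac12 a^{ij} D_{ij} G - \hat a^{ij} D_i W D_j G - \tilde b^i D_i G \le C_r|DW| + C_r|DW|^2 + C_r|DW|^3 - \frac{1}{4N\nu_2}\bigl(a^{ij} D_{ij} W\bigr)^2 \quad\text{in } B_{2r}. \tag{2.9}$$

Here and in the proof below, we suppose that $C_r$ is a constant depending only on $r$
and $C$ is a nonnegative constant independent of $r$ and $R$.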
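The matrix inequality invoked in the proof of Lemma 2.4 is an instance of the Cauchy-Schwarz inequality for the trace inner product. A short justification, added here as a sketch (the symmetric square root $A^{1/2}$ exists because $A$ is nonnegative-definite):

```latex
% Cauchy-Schwarz for <X, Y> = tr(X^T Y), applied to X = A^{1/2}, Y = A^{1/2} B.
\begin{align*}
(\operatorname{tr} AB)^2
  &= \bigl(\operatorname{tr}(A^{1/2}\cdot A^{1/2} B)\bigr)^2
   \le \operatorname{tr}(A)\,\operatorname{tr}\bigl((A^{1/2}B)^T A^{1/2} B\bigr) \\
  &= \operatorname{tr}(A)\,\operatorname{tr}(B A B)
   = \operatorname{tr}(A)\,\operatorname{tr}(A B^2)
   \le N\nu_2\,\operatorname{tr}(A B^2).
\end{align*}
% With A = a(x) and B = D^2 W this reads
\[
\bigl(a^{ij} D_{ij} W\bigr)^2 \le N\nu_2\; a^{ij} D_{ki} W\, D_{kj} W .
\]
```

The last step uses $\operatorname{tr}(A)\le N\nu_2$ and $\operatorname{tr}(AB^2)\ge 0$; the second display is the form in which the inequality turns the remaining second-order term into the negative square appearing in the gradient bound.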
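Fix arbitrary $\xi\in B_r$ and take a cut-off function $\varphi\in C_0^\infty(\mathbb{R}^N)$ satisfying the
following:

$$0 \le \varphi \le 1 \text{ in } \mathbb{R}^N, \quad \varphi(\xi) = 1, \quad \varphi \equiv 0 \text{ in } B_r(\xi)^c, \qquad |\nabla\varphi| \le C\varphi^{1/2}, \quad |D^2\varphi| \le C_r, \tag{2.10}$$

where $B_r(\xi)$ is the open ball with radius $r$ centered at $\xi$. Let $x_0$ be a maximum point
of $\varphi G$ in $B_r(\xi)$. By the maximum principle, we can see

$$\begin{aligned}
0 &\le -\tfrac12 a^{ij} D_{ij}(\varphi G) - \hat a^{ij} D_i W D_j(\varphi G) - \tilde b^i D_i(\varphi G) \\
&= \varphi\bigl\{-\tfrac12 a^{ij} D_{ij} G - \hat a^{ij} D_i W D_j G - \tilde b^i D_i G\bigr\} \\
&\qquad - \tfrac12 a^{ij}(D_{ij}\varphi) G - a^{ij} D_i\varphi D_j G - \hat a^{ij} D_j\varphi (D_i W) G - \tilde b^i (D_i\varphi) G \\
&\le \varphi\bigl\{-\tfrac12 a^{ij} D_{ij} G - \hat a^{ij} D_i W D_j G - \tilde b^i D_i G\bigr\} + C_r G + C\varphi^{1/2} G^{3/2} \quad\text{at } x_0,
\end{aligned} \tag{2.11}$$

where we used $0 = D(\varphi G) = G D\varphi + \varphi DG$ and (2.10). From (2.5) and (2.9), it is
implied that

$$\begin{aligned}
\text{RHS of (2.11)} &\le \varphi\Bigl\{C_r G^{1/2} + C_r G + C_r G^{3/2} - \frac{1}{4N\nu_2}\bigl(a^{ij} D_{ij} W\bigr)^2\Bigr\} + C_r G + C\varphi^{1/2} G^{3/2} \\
&= \varphi\Bigl\{C_r G^{1/2} + C_r G + C_r G^{3/2} - \frac{1}{N\nu_2}\Bigl(-\tfrac12 \hat a^{ij} D_i W D_j W - \tilde b^i D_i W - V + \Lambda\Bigr)^2\Bigr\} \\
&\qquad + C_r G + C\varphi^{1/2} G^{3/2} \quad\text{at } x_0.
\end{aligned} \tag{2.12}$$

By (A2), the following inequalities hold for some positive constant $\kappa$ which
depends on $\mu_1$:

$$\begin{aligned}
-\tfrac12 \hat a^{ij} D_i W D_j W - \tilde b^i D_i W - V + \Lambda &\le -2\kappa|DW|^2 + C_r|DW| - V + \Lambda \\
&\le -\kappa|DW|^2 + C_r - V + \Lambda.
\end{aligned} \tag{2.13}$$

In the case that $-\kappa|DW|^2 + C_r - V + \Lambda > 0$ at $x_0$, we have

$$\kappa|DW|^2(x_0) \le C_r - V(x_0) + \Lambda \le C_r + \Lambda,$$

where we used $x_0\in B_{2r}$. Since

$$\tfrac12|DW|^2(\xi) = \tfrac12|DW|^2(\xi)\varphi(\xi) \le G(x_0)\varphi(x_0),$$

we obtain the following gradient estimate at $\xi$:

$$\kappa|DW|^2(\xi) \le C_r + \Lambda.$$

We next consider the case that

$$-\kappa|DW|^2 + C_r - V + \Lambda \le 0 \quad\text{at } x_0.$$

By (2.13),

$$\begin{aligned}
\text{RHS of (2.12)} &\le \varphi\Bigl\{C_r G^{1/2} + C_r G + C_r G^{3/2} - \frac{1}{N\nu_2}\bigl(-\kappa|DW|^2 + C_r - V + \Lambda\bigr)^2\Bigr\} + C_r G + C\varphi^{1/2} G^{3/2} \\
&\le \varphi\Bigl\{C_r G^{1/2} + C_r G + C_r G^{3/2} - \frac{4\kappa^2}{N\nu_2} G^2 + \frac{4\kappa}{N\nu_2}(C_r - V + \Lambda) G\Bigr\} + C_r G + C\varphi^{1/2} G^{3/2}.
\end{aligned} \tag{2.14}$$

If $C_r - V + \Lambda > \kappa G(x_0)/4$ or $C_r \ge G(x_0)$, we have the bound $|DW|^2(\xi) \le
C_r + C\Lambda$ in the same way as the above case. We shall consider the case that
$C_r - V + \Lambda \le \kappa G(x_0)/4$ and $C_r < G(x_0)$. Then, from (2.14), we have

$$\begin{aligned}
0 &\le \varphi\Bigl\{C_r G^{1/2} + C_r G + C_r G^{3/2} - \frac{3\kappa^2}{N\nu_2} G^2\Bigr\} + C_r G + C\varphi^{1/2} G^{3/2} \\
&\le -C_1\varphi G^2 + C_2\varphi^{1/2} G^{3/2} + C_3 C_r G \\
&= -C_1\varphi G^2 + C_2\varphi^{1/2} G^{3/2} + \tilde C_3 G \quad\text{at } x_0, \qquad \tilde C_3 = C_3 C_r,
\end{aligned}$$

where $C_1$, $C_2$, $C_3$ are positive constants independent of $r$, $R$ and $\Lambda$. By setting
$X = \varphi^{1/2}(x_0) G^{1/2}(x_0)$, we have

$$0 \le -C_1 X^2 + C_2 X + \tilde C_3.$$

Therefore, we have

$$X^2 = \varphi G \le \Bigl(\frac{C_2 + \sqrt{C_2^2 + 4C_1\tilde C_3}}{2C_1}\Bigr)^2 \le \frac{C_2^2}{C_1^2} + \frac{2\tilde C_3}{C_1}.$$

Since $(1/2)|DW|^2(\xi) = (1/2)|DW|^2(\xi)\varphi(\xi) \le G(x_0)\varphi(x_0)$, we obtain the
bound for $|DW|(\xi)$. □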
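The last step of the proof solves a scalar quadratic inequality. For reference, a sketch of that elementary computation, with generic positive constants $C_1$, $C_2$ and a nonnegative constant $\tilde C_3$ (the explicit final bound is one convenient choice, not necessarily the constant used by the authors):

```latex
% If X >= 0 satisfies 0 <= -C_1 X^2 + C_2 X + \tilde C_3, then X lies below
% the larger root of the quadratic:
\[
X \le \frac{C_2 + \sqrt{C_2^2 + 4 C_1 \tilde C_3}}{2 C_1}.
\]
% Using (p + q)^2 <= 2 p^2 + 2 q^2 with p = C_2, q = \sqrt{C_2^2 + 4 C_1 \tilde C_3}:
\[
X^2 \le \frac{2 C_2^2 + 2\bigl(C_2^2 + 4 C_1 \tilde C_3\bigr)}{4 C_1^2}
    = \frac{C_2^2}{C_1^2} + \frac{2\tilde C_3}{C_1}.
\]
```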


REMARK 2.5. Under some growth conditions for coefficients of (1.2), we
can obtain the growth order for the gradient of solutions. For instance, besides (A1), (A2),
suppose the following growth conditions: $Da^{ij}(x)$, $D\hat a^{ij}(x)$ are bounded and there
exist $c_1, c_2 > 0$ and $m \ge 1$ such that

$$|b(x)| \le c_1(1 + |x|^m), \qquad |Db(x)| \le c_1(1 + |x|^{m-1}),$$
$$|V(x)| \le c_2(1 + |x|^{2m}), \qquad |DV(x)| \le c_2(1 + |x|^{2m-1}).$$

Then, in Lemma 2.4, we can take $C_r = C(1 + r^{2m})$, that is,

$$\sup_{B_r}|\nabla W_R|^2 \le C(1 + r^{2m} + \Lambda), \qquad 0 < 2r < R.$$


We may normalize $W_R$ as $W_R(0) = 0$ because (1.1) does not include a zeroth-order
term in $W_R$. Then, from Lemma 2.4, there exists $W\in C(\mathbb{R}^N)$ such that $W_R$
converges to $W$ on each compact set as $R\to\infty$ by taking a subsequence if necessary.
Also, since $\{W_R\}_{R>2r}$ is bounded in $H^1(B_r)$ by Lemma 2.4, $W_R$ converges to
$W$ $L^2_{\mathrm{loc}}$-strongly and $H^1_{\mathrm{loc}}$-weakly. Furthermore, we can see that $\nabla W_R$ converges
$L^2_{\mathrm{loc}}$-strongly in a similar way to Lemma 2.8 in [14] and Section 1.4 in [20].

We rewrite (2.3), (2.4) in integral form:

$$-\tfrac12\int a^{ij} D_i W_R D_j\varphi\,dx + \tfrac12\int \hat a^{ij} D_i W_R D_j W_R\,\varphi\,dx + \int b\cdot\nabla W_R\,\varphi\,dx + \int V\varphi\,dx = \int \Lambda\varphi\,dx, \qquad \varphi\in C_0^\infty(B_R).$$

Fix $r > 0$. Since $W_R$ converges to $W$ $H^1_{\mathrm{loc}}$-strongly, we obtain the following by
sending $R$ to $\infty$:

$$-\tfrac12\int a^{ij} D_i W D_j\varphi\,dx + \tfrac12\int \hat a^{ij} D_i W D_j W\,\varphi\,dx + \int b\cdot\nabla W\,\varphi\,dx + \int V\varphi\,dx = \int \Lambda\varphi\,dx, \qquad \varphi\in C_0^\infty(B_r),\ r > 0.$$

Owing to the regularity theorem of elliptic equations and the imbedding theorem,
we have $W$ as a classical solution of (1.1). Therefore, we have proved that $\mathcal{A}\ne\emptyset$.
We shall state and prove the form of the set $\mathcal{A}$.


THEOREM 2.6. Under the assumptions (A1)-(A3), there exists $\Lambda^*\in(-\infty,\infty)$
such that $\mathcal{A} = [\Lambda^*, \infty)$.

PROOF. In order to show $\inf\mathcal{A} > -\infty$, we suppose $\inf\mathcal{A} = -\infty$, that is, there
exists $\{\Lambda_n\}\subset\mathcal{A}$ such that $\Lambda_n$ tends to $-\infty$ as $n\to\infty$. Let $W_n$ be a solution
of (1.1) corresponding to $\Lambda_n$. Then, by the integral form of (1.1), we have

$$-\tfrac12\int a^{ij} D_i W_n D_j\varphi\,dx + \tfrac12\int \hat a^{ij} D_i W_n D_j W_n\,\varphi\,dx + \int b\cdot\nabla W_n\,\varphi\,dx + \int V\varphi\,dx = \int \Lambda_n\varphi\,dx, \qquad \varphi\in C_0^\infty(\mathbb{R}^N). \tag{2.15}$$

Take $\varphi\in C_0^\infty(\mathbb{R}^N)$ such that $\int\varphi\,dx \ne 0$. Since $\{\Lambda_n\}$ is bounded from above, it is
implied from Lemma 2.4 that

$$\sup_{B_r}|\nabla W_n| \le C_r, \tag{2.16}$$

where $C_r$ is a constant independent of $n$ and $r$ is taken such that $\operatorname{supp}\varphi\subset B_r$.
Therefore, the left-hand side of (2.15) is bounded in $n$. On the other hand, the
right-hand side of (2.15) is unbounded because of the assumption which we made
above. This leads to a contradiction.

We shall next prove that if $\Lambda\in\mathcal{A}$, then $[\Lambda, \infty)\subset\mathcal{A}$. Let $W$ be a solution
corresponding to $\Lambda$. For arbitrary $\bar\Lambda > \Lambda$, we have

$$\tfrac12 D_i(a^{ij} D_j W) + \tfrac12 \hat a^{ij} D_i W D_j W + b\cdot\nabla W + V = \Lambda < \bar\Lambda \quad\text{in } \mathbb{R}^N. \tag{2.17}$$

By Remark 2.3, for $R' > R$, there exists $\overline{W}_0$ such that

$$\tfrac12 D_i(a^{ij} D_j \overline{W}_0) + \tfrac12 \hat a^{ij} D_i\overline{W}_0 D_j\overline{W}_0 + b\cdot\nabla\overline{W}_0 + V \ge \bar\Lambda \quad\text{in } B_{R'}. \tag{2.18}$$

Consider the Dirichlet problem (2.3) with boundary condition $W_R = W_0$ on $\partial B_R$.
From (2.17), (2.18), the existence of a classical solution for this Dirichlet problem
is guaranteed by Theorem 8.4, Chapter 4 in [18]. In the same manner as that right
after the proof of Lemma 2.4, we can see that there exists a smooth function $\overline{W}$
satisfying (1.1) for $\bar\Lambda$.

We shall prove that $\Lambda^* = \inf\mathcal{A}$ actually belongs to $\mathcal{A}$. Let $\{\Lambda_n\}$ be a sequence in $\mathcal{A}$
such that $\Lambda_n\to\Lambda^*$ and let $W_n$ be a solution of (1.1) corresponding to $\Lambda_n$ normalized
as $W_n(0) = 0$. Then, $W_n$ satisfies (2.15). Since $\{\Lambda_n\}$ is bounded, it follows from
Lemma 2.4 that (2.16) holds for some constant $C_r$ independent of $n$. Following
the same way as the discussion after Lemma 2.4, we can see that a subsequence of $W_n$
converges to $W^*\in C(\mathbb{R}^N)$ uniformly on compact sets and $H^1_{\mathrm{loc}}$-strongly. By taking
a limit in (2.15) as $n\to\infty$, we have

$$-\tfrac12\int a^{ij} D_i W^* D_j\varphi\,dx + \tfrac12\int \hat a^{ij} D_i W^* D_j W^*\,\varphi\,dx + \int b\cdot\nabla W^*\,\varphi\,dx + \int V\varphi\,dx = \int \Lambda^*\varphi\,dx \qquad \forall\varphi\in C_0^\infty(\mathbb{R}^N).$$

Therefore, the existence of a classical solution $W^*$ of (1.1) for $\Lambda^*$ follows from
the regularity theorem of elliptic equations (see Theorems 5.1, 6.3, Chapter 4
in [18]). □


3. Classification of solutions. 


3.1. Transience and ergodicity of diffusion processes. In the last section we
proved that the set of $\Lambda$ for which (1.1) has a smooth solution is $\mathcal{A} = [\Lambda^*, \infty)$ for
some $\Lambda^*\in(-\infty,\infty)$. In the present section we shall study the classification of $\mathcal{A}$
by global behavior of $\{X_t\}$ defined by (1.8).

Let $(\Omega, \mathcal{F}, P, \{\mathcal{F}_t\})$ be a filtered probability space on which $N$-dimensional
Brownian motion $\{B_t\}$ is defined. For given $\Lambda\in[\Lambda^*, \infty)$, consider the SDE:

$$dX_t = (\tilde b(X_t) + \hat a\nabla W(X_t))\,dt + \sigma(X_t)\,dB_t, \qquad X_0 = x, \tag{3.1}$$

where $W(x)$ is a solution of (1.1) corresponding to $\Lambda$. We shall classify $\mathcal{A}$
according to the global properties of $\{X_t\}$. More precisely, we shall prove that for
$\Lambda > \Lambda^*$, $\{X_t\}$ is transient and for $\Lambda = \Lambda^*$, $\{X_t\}$ is ergodic. Note that the solution of
(3.1) might explode in finite time.

We shall next discuss transience of $\{X_t\}$ for $\Lambda\in(\Lambda^*, \infty)$. We introduce the
operator associated to solution $(W, \Lambda)$ of (1.1):

$$T_t^{W,\Lambda} f(x) = E_x[f(X_t);\ t < \tau_\infty], \qquad f\in C_b(\mathbb{R}^N),$$
$$\tau_n = \inf\{t;\ X_t\notin B_n(0)\}, \qquad \tau_\infty = \lim_{n\to\infty}\tau_n,$$

where $\{X_t\}$ is a solution of (3.1) up to $t < \tau_\infty$, corresponding to $(W, \Lambda)$.

LEMMA 3.1. Under (A1)-(A3), the following inequality holds for each
solution $(W, \Lambda)$ of (1.1):

$$T_t^{W,\Lambda} f(x) \le k(x)\,e^{-c(\Lambda-\Lambda^*)t}, \qquad f\in C_0(\mathbb{R}^N),\ f \ge 0,$$

where $c$ is in Remark 2.1 and $k(x)$ is a constant depending only on $x$.


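PROOF. Let $W^*$ be a solution of (1.1) corresponding to $\Lambda^*$. We set $W_c = cW$,
$W_c^* = cW^*$, where $c > 0$ is taken from Remark 2.1. Then, we have from (1.2)

$$\tfrac12 a^{ij} D_{ij} W_c + \frac{1}{2c}\hat a^{ij} D_i W_c D_j W_c + \tilde b\cdot\nabla W_c + cV = c\Lambda, \tag{3.2}$$

$$\tfrac12 a^{ij} D_{ij} W_c^* + \frac{1}{2c}\hat a^{ij} D_i W_c^* D_j W_c^* + \tilde b\cdot\nabla W_c^* + cV = c\Lambda^*. \tag{3.3}$$

Subtracting (3.3) from (3.2),

$$\tfrac12 a^{ij} D_{ij}(W_c - W_c^*) + (\tilde b + \hat a\nabla W^*)\cdot\nabla(W_c - W_c^*) + \frac{1}{2c}\hat a\nabla(W_c - W_c^*)\cdot\nabla(W_c - W_c^*) = c(\Lambda - \Lambda^*).$$

Setting $\overline{W} = W_c - W_c^*$, we have

$$\tfrac12 a^{ij} D_{ij}\overline{W} + (\tilde b + \hat a\nabla W^*)\cdot\nabla\overline{W} + \frac{1}{2c}\hat a\nabla\overline{W}\cdot\nabla\overline{W} = c(\Lambda - \Lambda^*). \tag{3.4}$$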
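The subtraction of (3.3) from (3.2) uses a polarization identity for the quadratic form $\hat a$. The sketch below records it; the shorthand $p = \nabla W_c$, $q = \nabla W_c^*$ is introduced here only for illustration, with $W_c = cW$ and $W_c^* = cW^*$ as in the proof.

```latex
% Polarization for the symmetric matrix \hat a, with p = \nabla W_c, q = \nabla W_c^*:
\[
\hat a\, p\cdot p - \hat a\, q\cdot q
  = \hat a\,(p - q)\cdot(p - q) + 2\,\hat a\, q\cdot(p - q).
\]
% Dividing by 2c and using q = c\,\nabla W^*:
\[
\frac{1}{2c}\bigl(\hat a\, p\cdot p - \hat a\, q\cdot q\bigr)
  = \frac{1}{2c}\,\hat a\,(p - q)\cdot(p - q)
    + \hat a\,\nabla W^*\cdot(p - q).
\]
```

The second term is what moves $\hat a\nabla W^*$ into the drift of the subtracted equation, leaving a purely quadratic term in the difference $W_c - W_c^*$.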
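We consider (3.1) and rewrite this as follows:

$$\begin{aligned}
dX_t &= (\tilde b(X_t) + \hat a\nabla W(X_t))\,dt + \sigma(X_t)\,dB_t \\
&= (\tilde b(X_t) + \hat a\nabla W(X_t))\,dt - a\nabla\overline{W}(X_t)\,dt + \sigma(X_t)\,dB_t + a\nabla\overline{W}(X_t)\,dt \\
&= (\tilde b + \hat a\nabla W - a\nabla\overline{W})(X_t)\,dt + \sigma(X_t)\,d\overline{B}_t,
\end{aligned}$$

where

$$\overline{B}_s = B_s + \int_0^s \sigma\nabla\overline{W}(X_r)\,dr, \qquad s < \tau_\infty. \tag{3.5}$$

Define the measure $\overline{P}$ on $\mathcal{F}_t^n = \mathcal{F}_{t\wedge\tau_n}$ by

$$\frac{d\overline{P}}{dP}\Big|_{\mathcal{F}_t^n} = \exp\Bigl(-\int_0^t \sigma\nabla\overline{W}(X_s)1_{\{s<\tau_n\}}\,dB_s - \tfrac12\int_0^t a\nabla\overline{W}\cdot\nabla\overline{W}(X_s)1_{\{s<\tau_n\}}\,ds\Bigr).$$

Then, $\overline{P}$ is a probability measure and $\langle\overline{B}_{\cdot\wedge\tau_n}\rangle_s = \langle B_{\cdot\wedge\tau_n}\rangle_s = s\wedge\tau_n$. By (3.5), we
have

$$\begin{aligned}
E_x[f(X_t);\ t < \tau_n] &= \overline{E}_x\bigl[f(X_t)\,e^{\int_0^t \sigma\nabla\overline{W}(X_s)1_{\{s<\tau_n\}}\,dB_s + (1/2)\int_0^t a\nabla\overline{W}\cdot\nabla\overline{W}(X_s)1_{\{s<\tau_n\}}\,ds};\ t < \tau_n\bigr] \\
&= \overline{E}_x\bigl[f(X_t)\,e^{\int_0^t \sigma\nabla\overline{W}(X_s)1_{\{s<\tau_n\}}\,d\overline{B}_s - (1/2)\int_0^t a\nabla\overline{W}\cdot\nabla\overline{W}(X_s)1_{\{s<\tau_n\}}\,ds};\ t < \tau_n\bigr],
\end{aligned} \tag{3.6}$$

where $\overline{E}_x$ denotes expectation with respect to $\overline{P}$. Applying the Itô formula
to $\overline{W}(X_t)$,

$$\begin{aligned}
d\overline{W}(X_t) &= \nabla\overline{W}\cdot\Bigl(\tilde b + \hat a\nabla W^* + \frac1c\hat a\nabla\overline{W} - a\nabla\overline{W}\Bigr)(X_t)\,dt + \tfrac12 a^{ij} D_{ij}\overline{W}(X_t)\,dt + \sigma\nabla\overline{W}(X_t)\,d\overline{B}_t \\
&= \Bigl(\tfrac12 a^{ij} D_{ij}\overline{W} + (\tilde b + \hat a\nabla W^*)\cdot\nabla\overline{W}\Bigr)(X_t)\,dt + \Bigl(\frac1c\hat a\nabla\overline{W}\cdot\nabla\overline{W} - a\nabla\overline{W}\cdot\nabla\overline{W}\Bigr)(X_t)\,dt + \sigma\nabla\overline{W}(X_t)\,d\overline{B}_t \\
&= \Bigl(-\frac{1}{2c}\hat a\nabla\overline{W}\cdot\nabla\overline{W} + c(\Lambda - \Lambda^*)\Bigr)(X_t)\,dt + \Bigl(\frac1c\hat a\nabla\overline{W}\cdot\nabla\overline{W} - a\nabla\overline{W}\cdot\nabla\overline{W}\Bigr)(X_t)\,dt + \sigma\nabla\overline{W}(X_t)\,d\overline{B}_t \\
&= \sigma\nabla\overline{W}(X_t)\,d\overline{B}_t - \tfrac12 a\nabla\overline{W}\cdot\nabla\overline{W}(X_t)\,dt + \tfrac12\Bigl(\frac1c\hat a - a\Bigr)\nabla\overline{W}\cdot\nabla\overline{W}(X_t)\,dt + c(\Lambda - \Lambda^*)\,dt.
\end{aligned} \tag{3.7}$$

Here we used (3.4). Then, by (3.6) and (3.7), we have

$$\begin{aligned}
E_x[f(X_t);\ t < \tau_n] &= \overline{E}_x\bigl[f(X_t)\,e^{-c(\Lambda-\Lambda^*)t + \overline{W}(X_t) - \overline{W}(x) + (1/2)\int_0^t (a - (1/c)\hat a)\nabla\overline{W}\cdot\nabla\overline{W}(X_s)1_{\{s<\tau_n\}}\,ds};\ t < \tau_n\bigr] \\
&\le \|f\|_\infty \exp\Bigl(\sup_{y\in\operatorname{supp} f}(\overline{W}(y) - \overline{W}(x))\Bigr) \\
&\qquad\times e^{-c(\Lambda-\Lambda^*)t}\,\overline{E}_x\bigl[e^{(1/2)\int_0^t (a - (1/c)\hat a)\nabla\overline{W}\cdot\nabla\overline{W}(X_s)1_{\{s<\tau_n\}}\,ds};\ t < \tau_n\bigr].
\end{aligned}$$

Since $c\,a(x) \le \hat a(x)$, we have

$$E_x[f(X_t);\ t < \tau_n] \le k(x)\,e^{-c(\Lambda-\Lambda^*)t}, \qquad k(x) = \|f\|_\infty \exp\Bigl(\sup_{y\in\operatorname{supp} f}(\overline{W}(y) - \overline{W}(x))\Bigr).$$

Taking the limit as $n\to\infty$, we obtain

$$E_x[f(X_t);\ t < \tau_\infty] \le k(x)\,e^{-c(\Lambda-\Lambda^*)t}. \qquad\Box$$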
Now we have the result on transience. 


THEOREM 3.2. Let $(W, \Lambda)$ be a solution of (1.1) and $\{X_t\}$ be a solution of
(3.1) corresponding to $(W, \Lambda)$. If (A1)-(A3) hold, then for $\Lambda > \Lambda^*$, $\{X_t\}$ is
transient.

PROOF. Let $f\in C_0(\mathbb{R}^N)$ and $f \ge 0$. Since $\Lambda > \Lambda^*$, we can see that by
Lemma 3.1,

$$\int_0^\infty T_t^{W,\Lambda} f(x)\,dt < \infty.$$

Therefore, $\{X_t\}$ is transient. □
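The finiteness of the time integral in the proof of Theorem 3.2 follows from the exponential bound of Lemma 3.1 by a one-line computation, written out here for convenience ($k(x)$ and $c$ are the quantities of Lemma 3.1):

```latex
% For \Lambda > \Lambda^*, the exponent c(\Lambda - \Lambda^*) is strictly positive, so
\[
\int_0^\infty T_t^{W,\Lambda} f(x)\,dt
  \le k(x)\int_0^\infty e^{-c(\Lambda - \Lambda^*)t}\,dt
  = \frac{k(x)}{c(\Lambda - \Lambda^*)} < \infty .
\]
```

The bound degenerates as $\Lambda\downarrow\Lambda^*$, which is consistent with the different behavior established for $\Lambda = \Lambda^*$.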


We proved that for $\Lambda > \Lambda^*$, $\{X_t\}$ defined by (3.1) is transient. We next show that
if $\Lambda = \Lambda^*$, the corresponding diffusion process $\{X_t^*\}$ satisfying (3.1) is ergodic.
We have to show the following proposition.

PROPOSITION 3.3. Let $(W, \Lambda)$ be a solution of (1.1) and let $\{X_t\}$ be the
corresponding diffusion process defined by (3.1). Assume (A1)-(A3). If $\{X_t\}$ is
transient, then there exists $\alpha > 0$ such that

$$T_t^{W,\Lambda} f(x) \le C(x)\,e^{-\alpha t}, \qquad f\in C_0(\mathbb{R}^N),\ f \ge 0,\ x\in\mathbb{R}^N,$$

where $C(x)$ is a constant independent of $t$, but depending on $x$.

We prepare several lemmas to prove the above proposition.


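Let $(W, \Lambda)$ be a solution of (1.1) and $\{X_t\}$ be a solution of (3.1). We define the
occupation measure for $\{X_t\}$ on $\{t < \tau_\infty\}$ as follows:

$$\mu_t(B) = \frac1t\int_0^t 1_B(X_s)\,ds, \qquad B\in\mathcal{B}(\mathbb{R}^N),\ t < \tau_\infty,$$

where $\mathcal{B}(\mathbb{R}^N)$ is the Borel $\sigma$-field on $\mathbb{R}^N$. Let $\mathcal{M}_1(\mathbb{R}^N)$ be the set of probability
measures on $\mathcal{B}(\mathbb{R}^N)$. We think of $\mathcal{M}_1(\mathbb{R}^N)$ as the topological space with
topology compatible with weak convergence. Note that $\mu_t\in\mathcal{M}_1(\mathbb{R}^N)$ on $\{t < \tau_\infty\}$.
The following lemma on a large deviation type estimate is useful.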


LEMMA 3.4. Let $\{X_t\}$ be a solution of (3.1). Then, the following estimate
holds:

$$\limsup_{t\to\infty}\frac1t\log P_x[\mu_t\in K,\ t < \tau_\infty] \le -\inf_{\mu\in K} I^W(\mu), \qquad K \text{ a compact set in } \mathcal{M}_1(\mathbb{R}^N). \tag{3.8}$$

$I^W(\mu)$ is defined as follows:

$$I^W(\mu) = -\inf_{u\in\mathcal{U}}\int\frac{Lu}{u}(x)\,\mu(dx), \qquad L = \tfrac12 a^{ij} D_{ij} + (\tilde b + \hat a\nabla W)\cdot\nabla,$$
$$\mathcal{U} = \{u\in C^2(\mathbb{R}^N) : u(x) > 0 \text{ for all } x,\ Lu/u \text{ is bounded above}\}.$$

Note that $I^W(\mu)$ takes values in $[0, \infty]$ and is convex, lower semi-continuous
on $\mathcal{M}_1(\mathbb{R}^N)$. This type of estimate is well known in large deviation theory. As
noted in [5], even if the state space of $\{X_t\}$ is not compact, (3.8) holds for compact
set $K$ (cf. comments in Section 7, page 440 of [5] and see the proof in Section 2.2
of [4] for Brownian motion).


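LEMMA 3.5. If $I^W(\mu^*) = 0$ for some $\mu^*\in\mathcal{M}_1(\mathbb{R}^N)$, then the diffusion process
$\{X_t\}$ defined in (3.1) does not explode in finite time.

PROOF. From the assumption $I^W(\mu^*) = 0$, it is implied that

$$\int\frac{Lu}{u}\,d\mu^* \ge 0 \qquad \forall u\in\mathcal{U}. \tag{3.9}$$

For $u\in C_0^\infty(\mathbb{R}^N)$, $u \ge 0$, and constant $c > 0$,

$$\frac{d}{dt}\int\log\bigl(T_t^{W,\Lambda}u + c\bigr)\,d\mu^* = \int\frac{L\,T_t^{W,\Lambda}u}{T_t^{W,\Lambda}u + c}\,d\mu^* = \int\frac{L\bigl(T_t^{W,\Lambda}u + c\bigr)}{T_t^{W,\Lambda}u + c}\,d\mu^*.$$

Since $T_t^{W,\Lambda}u + c\in\mathcal{U}$, we have

$$\frac{d}{dt}\int\log\frac{T_t^{W,\Lambda}u + c}{u + c}\,d\mu^* \ge 0 \qquad \forall t.$$

Thus, we can see that

$$\int\log\frac{T_t^{W,\Lambda}u + c}{u + c}\,d\mu^* \ge 0. \tag{3.10}$$

Let $\{\phi_n(x)\}_{n=1}^\infty$ be a sequence in $C_0^\infty(\mathbb{R}^N)$ such that $0 \le \phi_n(x) \le 1$ and $\phi_n(x)\uparrow 1$
as $n\to\infty$. If we take $u = \phi_n$ in (3.10), then we have

$$\int\log\frac{T_t^{W,\Lambda}\phi_n + c}{\phi_n + c}\,d\mu^* \ge 0.$$

Since $T_t^{W,\Lambda}\phi_n \le T_t^{W,\Lambda}1$,

$$\int\log\frac{T_t^{W,\Lambda}1 + c}{\phi_n + c}\,d\mu^* \ge 0.$$

By taking the limit as $n\to\infty$, we obtain

$$\int\log\frac{T_t^{W,\Lambda}1 + c}{1 + c}\,d\mu^* \ge 0.$$

Noting that $(T_t^{W,\Lambda}1 + c)/(1 + c) \le 1$, we can see that

$$T_t^{W,\Lambda}1 = 1, \qquad \mu^*\text{-a.s.}$$

Since the diffusion process is nondegenerate [see (A1)],

$$T_t^{W,\Lambda}1(x) = 1 \qquad \forall x\in\mathbb{R}^N,\ \forall t \ge 0.$$

Finally, as $t\to\infty$, we have

$$P_x[\tau_\infty = \infty] = \lim_{t\to\infty} P_x[t < \tau_\infty] = \lim_{t\to\infty} T_t^{W,\Lambda}1(x) = 1. \qquad\Box$$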


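LEMMA 3.6. Let $\{X_t\}$ be a solution of (3.1). If $I^W(\mu^*) = 0$, then $\mu^*$ is an
invariant measure for $\{X_t\}$.

PROOF. Since $I^W(\mu^*) = -\inf_{u\in\mathcal{U}}\int(Lu/u)(x)\,\mu^*(dx) = 0$,

$$\int\frac{Lu}{u}(x)\,\mu^*(dx) \ge 0 \qquad \forall u\in\mathcal{U}.$$

Setting $w = \log u$, we have

$$\int\bigl(Lw(x) + \tfrac12 a\nabla w\cdot\nabla w(x)\bigr)\,\mu^*(dx) \ge 0, \qquad u = e^w\in\mathcal{U}. \tag{3.11}$$

Denote $\tilde{\mathcal{U}}$ by

$$\tilde{\mathcal{U}} = \{u\in C^2(\mathbb{R}^N) : \exists R > r > 0 \text{ s.t. } r \le u(x) \le R,\ Du, D^2u \text{ have compact support}\}.$$

Note that $\tilde{\mathcal{U}}\subset\mathcal{U}$. It is easy to see that if $u = e^w\in\tilde{\mathcal{U}}$, then $u_\lambda = e^{\lambda w}\in\tilde{\mathcal{U}}$ for
$\lambda\in\mathbb{R}$. Therefore, applying $\lambda w$ in (3.11) instead of $w$,

$$\int\Bigl(Lw(x) + \frac{\lambda}{2} a\nabla w\cdot\nabla w(x)\Bigr)\,\mu^*(dx) \ge 0, \qquad u = e^w\in\tilde{\mathcal{U}},\ \lambda > 0.$$

Taking the limit as $\lambda\to0$, we have

$$\int Lw(x)\,\mu^*(dx) \ge 0, \qquad u = e^w\in\tilde{\mathcal{U}}.$$

Since $u = e^w\in\tilde{\mathcal{U}}$ implies $u_{-1} = e^{-w}\in\tilde{\mathcal{U}}$, we obtain the following equation:

$$\int Lw(x)\,\mu^*(dx) = 0, \qquad u = e^w\in\tilde{\mathcal{U}}.$$

Noting that $C_0^\infty(\mathbb{R}^N)$ is included in $\{w : u = e^w\in\tilde{\mathcal{U}}\}$, $\mu^*$ satisfies the following
partial differential equation in the distributional sense:

$$L^*\mu^* = 0 \quad\text{in } \mathbb{R}^N,$$

where $L^*$ is the formal adjoint of $L$. Since we assumed the coefficients of $L$ are
sufficiently smooth, $\mu^*$ has a density $p^*(x)$ and $p^*$ satisfies

$$L^*p^* = 0 \quad\text{in } \mathbb{R}^N.$$

Here we recall that the diffusion process $\{X_t\}$ does not explode in finite time because
of Lemma 3.5. Then, by slight modifications of the Theorem on page 243 of [24]
to the case that the second-order term of $L$ is in divergence form, $\mu^*(dx) = p^*(x)\,dx$
is actually an invariant measure. □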


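PROOF OF PROPOSITION 3.3. Let us define $U_0$ as follows:

$$U_0(x) = -\bigl(\tfrac12 a^{ij} D_{ij} W_0 + \tfrac12 \hat a\nabla W_0\cdot\nabla W_0 + \tilde b\cdot\nabla W_0 + V\bigr),$$

where we take $W_0$ from (A3). By setting $W_{0,c} = cW_0$ and $W_c = cW$, we have

$$\begin{aligned}
\tfrac12 a^{ij} D_{ij} W_{0,c} + \frac{1}{2c}\hat a\nabla W_{0,c}\cdot\nabla W_{0,c} + \tilde b\cdot\nabla W_{0,c} + cV &= -cU_0, \\
\tfrac12 a^{ij} D_{ij} W_c + \frac{1}{2c}\hat a\nabla W_c\cdot\nabla W_c + \tilde b\cdot\nabla W_c + cV &= c\Lambda,
\end{aligned} \tag{3.12}$$

where $c$ is in Remark 2.1. In the above equations, subtracting each side of the
equations,

$$\tfrac12 a^{ij} D_{ij}(W_{0,c} - W_c) + (\tilde b + \hat a\nabla W)\cdot(\nabla W_{0,c} - \nabla W_c) + \frac{1}{2c}\hat a(\nabla W_{0,c} - \nabla W_c)\cdot(\nabla W_{0,c} - \nabla W_c) = -c(U_0 + \Lambda).$$

Define $\phi$ as $\phi = e^{W_{0,c} - W_c}$. Then, we have

$$\tfrac12 a^{ij} D_{ij}\phi + (\tilde b + \hat a\nabla W)\cdot\nabla\phi + \frac{1}{2c\phi}(\hat a - c\,a)\nabla\phi\cdot\nabla\phi = -c(U_0 + \Lambda)\phi. \tag{3.13}$$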
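Equation (3.13) follows from the subtracted equation by the exponential change of variable. A sketch of the computation, with the shorthand $w = W_{0,c} - W_c$ (so $\phi = e^{w}$) introduced here for illustration:

```latex
% With \phi = e^{w}:  \nabla w = \nabla\phi/\phi  and
%   D_{ij} w = D_{ij}\phi/\phi - D_i\phi\, D_j\phi/\phi^2 .
\begin{align*}
\tfrac12 a^{ij} D_{ij} w
  &= \frac{1}{\phi}\,\tfrac12 a^{ij} D_{ij}\phi
     - \frac{1}{2\phi^2}\, a\nabla\phi\cdot\nabla\phi, \\
\frac{1}{2c}\,\hat a\nabla w\cdot\nabla w
  &= \frac{1}{2c\,\phi^2}\,\hat a\nabla\phi\cdot\nabla\phi .
\end{align*}
% Substituting into the subtracted equation and multiplying by \phi:
\[
\tfrac12 a^{ij} D_{ij}\phi + (\tilde b + \hat a\nabla W)\cdot\nabla\phi
  + \frac{1}{2c\,\phi}\,(\hat a - c\,a)\nabla\phi\cdot\nabla\phi
  = -c\,(U_0 + \Lambda)\,\phi .
\]
```

Since $c\,a \le \hat a$ by Remark 2.1 and $\phi > 0$, the gradient term is nonnegative, which is what allows it to be dropped in the subsequent supermartingale-type estimate.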


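Let $\{X_t\}$ be a solution of (3.1). By the Itô formula and (3.13),

$$\begin{aligned}
d\bigl(\phi(X_t)\,e^{\int_0^t c(U_0(X_s)+\Lambda)\,ds}\bigr)
&= \Bigl\{\tfrac12 a^{ij} D_{ij}\phi + (\tilde b + \hat a\nabla W)\cdot\nabla\phi + c(U_0 + \Lambda)\phi\Bigr\}(X_t)\,e^{\int_0^t c(U_0(X_s)+\Lambda)\,ds}\,dt \\
&\qquad + \sigma\nabla\phi(X_t)\,e^{\int_0^t c(U_0(X_s)+\Lambda)\,ds}\,dB_t \\
&= -\frac{1}{2c\phi}(\hat a - c\,a)\nabla\phi\cdot\nabla\phi(X_t)\,e^{\int_0^t c(U_0(X_s)+\Lambda)\,ds}\,dt + \sigma\nabla\phi(X_t)\,e^{\int_0^t c(U_0(X_s)+\Lambda)\,ds}\,dB_t.
\end{aligned}$$

Since $c\,a(x) \le \hat a(x)$ and $\phi > 0$, we obtain

$$\phi(X_t)\,e^{\int_0^t c(U_0(X_s)+\Lambda)\,ds} \le \phi(x) + \int_0^t \sigma\nabla\phi(X_r)\,e^{\int_0^r c(U_0(X_s)+\Lambda)\,ds}\,dB_r. \tag{3.14}$$

By using the stopping time $t\wedge\tau_n$ in (3.14),

$$E_x\bigl[\phi(X_{t\wedge\tau_n})\,e^{\int_0^{t\wedge\tau_n} c(U_0(X_s)+\Lambda)\,ds}\bigr] \le \phi(x).$$

Then, as $n\to\infty$, we have

$$E_x\bigl[\phi(X_t)\,e^{\int_0^t c(U_0(X_s)+\Lambda)\,ds};\ t < \tau_\infty\bigr] \le \phi(x). \tag{3.15}$$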
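Let $C_m$ be a subset in $\mathcal{M}_1(\mathbb{R}^N)$ defined as follows:

$$C_m = \{\mu\in\mathcal{M}_1(\mathbb{R}^N) : \mu(B_l) \ge 1 - \delta_l\ \ \forall l \ge m\}, \qquad m \ge 1,$$

where $\{\delta_l\}$ is a sequence such that $\delta_l\to0$ and determined later. Note that $C_m$
is a relatively compact set in $\mathcal{M}_1(\mathbb{R}^N)$ because $C_m$ is tight. From the definition
of $T_t^{W,\Lambda}f$,

$$\begin{aligned}
T_t^{W,\Lambda} f(x) &= E_x[f(X_t);\ \mu_t\in C_m,\ t < \tau_\infty] + E_x[f(X_t);\ \mu_t\notin C_m,\ t < \tau_\infty] \\
&\le \|f\|_\infty P_x[\mu_t\in C_m,\ t < \tau_\infty] + E_x[f(X_t);\ \mu_t\notin C_m,\ t < \tau_\infty] \\
&\le \|f\|_\infty P_x[\mu_t\in C_m,\ t < \tau_\infty] + \Bigl\|\frac{f}{\phi}\Bigr\|_\infty E_x[\phi(X_t);\ \mu_t\notin C_m,\ t < \tau_\infty],
\end{aligned} \tag{3.16}$$

for $f\in C_0(\mathbb{R}^N)$, $f \ge 0$.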

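We shall prove that $E_x[\phi(X_t);\ \mu_t\notin C_m,\ t < \tau_\infty]$ decays exponentially as
$t\to\infty$. On the event $\{\mu_t\notin C_m,\ t < \tau_\infty\}$, there exists $l \ge m$ such that

$$\mu_t(B_l) = \frac1t\int_0^t 1_{B_l}(X_s)\,ds < 1 - \delta_l, \tag{3.17}$$

which is equivalent to

$$\mu_t(B_l^c) = \frac1t\int_0^t 1_{B_l^c}(X_s)\,ds > \delta_l. \tag{3.18}$$

Then, we have on $\{\mu_t\notin C_m,\ t < \tau_\infty\}$

$$\begin{aligned}
\int_0^t c(U_0(X_s) + \Lambda)\,ds &= \int_0^t c(U_0(X_s) + \Lambda)1_{B_l}(X_s)\,ds + \int_0^t c(U_0(X_s) + \Lambda)1_{B_l^c}(X_s)\,ds \\
&\ge \inf_x c(U_0(x) + \Lambda)\int_0^t 1_{B_l}(X_s)\,ds + \inf_{|x|\ge l} c(U_0(x) + \Lambda)\int_0^t 1_{B_l^c}(X_s)\,ds \\
&= \beta_0\,\mu_t(B_l)\,t + \beta_l\,\mu_t(B_l^c)\,t,
\end{aligned} \tag{3.19}$$

where we set $\beta_0 = \inf_x c(U_0(x) + \Lambda)$, $\beta_l = \inf_{|x|\ge l} c(U_0(x) + \Lambda)$. By (A3), there
exists $m \ge 1$ such that

$$\beta_l > 0 \qquad \forall l \ge m. \tag{3.20}$$

So, we obtain from (3.17)-(3.19),

$$\int_0^t c(U_0(X_s) + \Lambda)\,ds \ge \bigl(-|\beta_0|(1 - \delta_l) + \beta_l\delta_l\bigr)t.$$

Take a positive constant $M > 0$. Then we choose $\delta_l$ such that $M = -|\beta_0|(1 - \delta_l) +
\beta_l\delta_l$. Indeed, $\delta_l$ is defined by

$$\delta_l = \frac{M + |\beta_0|}{|\beta_0| + \beta_l}.$$

Then, we have

$$\int_0^t c(U_0(X_s) + \Lambda)\,ds \ge Mt \quad\text{on } \{\mu_t\notin C_m,\ t < \tau_\infty\}. \tag{3.21}$$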
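The choice of $\delta_l$ is obtained by solving a linear equation, and the requirement $\delta_l\to0$ can be checked from (A3). A brief sketch of both points, added here for illustration:

```latex
% Solving M = -|\beta_0|(1 - \delta_l) + \beta_l \delta_l for \delta_l:
\[
M + |\beta_0| = \delta_l\,\bigl(|\beta_0| + \beta_l\bigr)
\quad\Longrightarrow\quad
\delta_l = \frac{M + |\beta_0|}{|\beta_0| + \beta_l}.
\]
% By (A3), U_0(x) \to \infty as |x| \to \infty, hence
\[
\beta_l = \inf_{|x|\ge l} c\,(U_0(x) + \Lambda) \to \infty
\quad\text{and}\quad \delta_l \to 0 \qquad (l\to\infty).
\]
```

In particular $\delta_l \le 1$ as soon as $\beta_l \ge M$, which holds for all sufficiently large $l$, so $m$ can be increased if necessary to make the sets $C_m$ well defined.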


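By (3.15) and (3.21),

$$\begin{aligned}
\phi(x) &\ge E_x\bigl[\phi(X_t)\,e^{\int_0^t c(U_0(X_s)+\Lambda)\,ds};\ \mu_t\notin C_m,\ t < \tau_\infty\bigr] \\
&\ge e^{Mt}\,E_x[\phi(X_t);\ \mu_t\notin C_m,\ t < \tau_\infty].
\end{aligned}$$

Therefore we obtain

$$E_x[\phi(X_t);\ \mu_t\notin C_m,\ t < \tau_\infty] \le \phi(x)\,e^{-Mt}, \qquad t \ge 0.$$

By (3.16), we have

$$T_t^{W,\Lambda} f(x) \le \|f\|_\infty P_x[\mu_t\in C_m,\ t < \tau_\infty] + \Bigl\|\frac{f}{\phi}\Bigr\|_\infty\phi(x)\,e^{-Mt}. \tag{3.22}$$

Applying Lemma 3.4,

$$\limsup_{t\to\infty}\frac1t\log P_x[\mu_t\in C_m,\ t < \tau_\infty] \le -\inf_{\mu\in\overline{C}_m} I^W(\mu),$$

where $\overline{C}_m$ is the closure of $C_m$. Since $I^W(\mu)$ is lower semi-continuous and $\overline{C}_m$ is
compact in $\mathcal{M}_1(\mathbb{R}^N)$, $\inf_{\mu\in\overline{C}_m} I^W(\mu)$ is attained at some $\mu^*\in\overline{C}_m$. Since existence
of an invariant measure implies recurrence, it follows from Lemmas 3.5 and 3.6 and
transience of $\{X_t\}$ that

$$\inf_{\mu\in\overline{C}_m} I^W(\mu) > 0.$$

Hence, we can find a positive constant $\alpha_m > 0$ such that

$$P_x[\mu_t\in C_m,\ t < \tau_\infty] \le C(x)\,e^{-\alpha_m t}, \qquad t \ge 0. \tag{3.23}$$

Then, from (3.22) and (3.23), we obtain

$$T_t^{W,\Lambda} f(x) \le C(x)\|f\|_\infty e^{-\alpha_m t} + \Bigl\|\frac{f}{\phi}\Bigr\|_\infty\phi(x)\,e^{-Mt}. \qquad\Box$$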


We are ready to prove that for $\Lambda = \Lambda^*$, the corresponding diffusion process $\{X_t^*\}$ is ergodic.


THEOREM 3.7. Let $(W^*, \Lambda^*)$ be a solution of (1.1) corresponding to $\Lambda^* = \inf \mathcal{A}$ and let $\{X_t^*\}$ be a solution of (3.1) for $(W^*, \Lambda^*)$. Under (A1)-(A3), $\{X_t^*\}$ is ergodic.


PROOF. Suppose that $\{X_t^*\}$ is transient. Then, by Proposition 3.3,

$T_t^{W^*,\Lambda^*} f(x) \le C(x)\,e^{-\alpha t} \quad \forall f \in C_0(\mathbb{R}^N),\ f \ge 0.$


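Note that $\alpha$ is a positive constant independent of $f$ and $x$. Taking $0 < \varepsilon < \alpha$, we see that

$\int_0^\infty e^{\varepsilon t}\, T_t^{W^*,\Lambda^*} f(x)\,dt \le \int_0^\infty e^{\varepsilon t}\, C(x)\, e^{-\alpha t}\,dt = C(x) \int_0^\infty e^{-(\alpha - \varepsilon)t}\,dt < \infty.$

Then, there exists a Green function $G(x, y)$ for $\frac{1}{2} a^{ij} D_{ij} + (\tilde{b} + \hat{a}\nabla W^*) \cdot \nabla + \varepsilon$, and $G(x, y)$ satisfies the following:

(3.24) $\frac{1}{2} a^{ij} D_{ij} G(\cdot, y) + (\tilde{b} + \hat{a}\nabla W^*) \cdot \nabla G(\cdot, y) + \varepsilon G(\cdot, y) = 0 \quad \text{in } \mathbb{R}^N \setminus \{y\}.$

We take a sequence $\{y_n\}$ in $\mathbb{R}^N$ such that $y_n \in B_{n+1} \setminus B_n$. Define $\phi_n(x)$ as follows:

$\phi_n(x) = \frac{G(x, y_n)}{G(0, y_n)}, \quad x \in \mathbb{R}^N \setminus \{y_n\}.$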


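Then, we have from (3.24)
(3.25) $\frac{1}{2} a^{ij} D_{ij} \phi_n + (\tilde{b} + \hat{a}\nabla W^*) \cdot \nabla \phi_n + \varepsilon \phi_n = 0 \quad \text{in } \mathbb{R}^N \setminus \{y_n\}.$
We note that by setting $W_n = (1/c) \log \phi_n$, (3.25) is equivalent to the following:
$\frac{1}{2} a^{ij} D_{ij} W_n + \frac{c}{2} a^{ij} D_i W_n D_j W_n + (\tilde{b} + \hat{a}\nabla W^*) \cdot \nabla W_n + \frac{\varepsilon}{c} = 0 \quad \text{in } \mathbb{R}^N \setminus \{y_n\},$
where $c$ is taken from Remark 2.1. By Lemma 2.4, we have

$\sup_{B_r} |\nabla W_n| \le C_r, \quad r < n.$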
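The logarithmic change of variables can be sanity-checked in one dimension. The sketch below assumes the substitution $\phi = e^{cW}$ (that is, $W = (1/c)\log\phi$) and compares $(\frac{1}{2}a\phi'' + B\phi' + \varepsilon\phi)/(c\phi)$, computed by central finite differences, with $\frac{1}{2}aW'' + \frac{c}{2}a(W')^2 + BW' + \varepsilon/c$. The test function $W = \sin$, the drift `B` and all constants are hypothetical choices for illustration only.

```python
import math


def check_log_transform(x, c=0.7, a=1.3, eps=0.2, h=1e-4):
    """Return both sides of the 1-D identity behind the log transform.

    lhs: (0.5*a*phi'' + B*phi' + eps*phi) / (c*phi), with phi = exp(c*W),
         derivatives of phi taken by central finite differences.
    rhs: 0.5*a*W'' + 0.5*c*a*(W')**2 + B*W' + eps/c, with exact derivatives.
    """
    W = math.sin
    dW = math.cos
    d2W = lambda y: -math.sin(y)
    B = lambda y: 0.5 * y                      # hypothetical drift coefficient
    phi = lambda y: math.exp(c * W(y))

    dphi = (phi(x + h) - phi(x - h)) / (2.0 * h)
    d2phi = (phi(x + h) - 2.0 * phi(x) + phi(x - h)) / (h * h)

    lhs = (0.5 * a * d2phi + B(x) * dphi + eps * phi(x)) / (c * phi(x))
    rhs = 0.5 * a * d2W(x) + 0.5 * c * a * dW(x) ** 2 + B(x) * dW(x) + eps / c
    return lhs, rhs


results = [check_log_transform(x) for x in (-1.0, 0.0, 0.5, 2.0)]
```

Both sides agree up to finite-difference error, so $\phi$ solves the linear equation exactly when $W$ solves the quadratic one.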


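Thus, in a similar way to the proof of existence of solutions of (1.1), we can see that there exists a smooth function $W$ such that

(3.26) $\frac{1}{2} a^{ij} D_{ij} W + (\tilde{b} + \hat{a}\nabla W^*) \cdot \nabla W + \frac{c}{2}\, a \nabla W \cdot \nabla W + \frac{\varepsilon}{c} = 0.$

Since $(W^*, \Lambda^*)$ is a solution of (1.2),
(3.27) $\frac{1}{2} a^{ij} D_{ij} W^* + \tilde{b} \cdot \nabla W^* + \frac{1}{2}\, \hat{a} \nabla W^* \cdot \nabla W^* + V - \Lambda^* = 0.$
Adding (3.26) to (3.27), it follows from Remark 2.1 that

$0 = \frac{1}{2} a^{ij} D_{ij}(W^* + W) + \tilde{b} \cdot (\nabla W^* + \nabla W) + \frac{1}{2}\, \hat{a} \nabla W^* \cdot \nabla W^* + \hat{a} \nabla W^* \cdot \nabla W + \frac{c}{2}\, a \nabla W \cdot \nabla W + V - \Big(\Lambda^* - \frac{\varepsilon}{c}\Big)$

$\ge \frac{1}{2} a^{ij} D_{ij}(W^* + W) + \tilde{b} \cdot (\nabla W^* + \nabla W) + \frac{1}{2}\, \hat{a} \nabla W^* \cdot \nabla W^* + \hat{a} \nabla W^* \cdot \nabla W + \frac{1}{2}\, \hat{a} \nabla W \cdot \nabla W + V - \Big(\Lambda^* - \frac{\varepsilon}{c}\Big)$

$= \frac{1}{2} a^{ij} D_{ij}(W^* + W) + \tilde{b} \cdot \nabla (W^* + W) + \frac{1}{2}\, \hat{a} \nabla (W^* + W) \cdot \nabla (W^* + W) + V - \Big(\Lambda^* - \frac{\varepsilon}{c}\Big).$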
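Thus, $W^* + W$ is a super solution of (1.1) for $\Lambda = \Lambda^* - \varepsilon/c$. In the same way as the proof that $\mathcal{A} \ne \emptyset$ given in Section 2, we can see that there exists a smooth function $\overline{W}$ such that

$\frac{1}{2} a^{ij} D_{ij} \overline{W} + \frac{1}{2}\, \hat{a} \nabla \overline{W} \cdot \nabla \overline{W} + \tilde{b} \cdot \nabla \overline{W} + V = \Lambda^* - \frac{\varepsilon}{c}.$

This contradicts $\Lambda^* = \inf \mathcal{A}$. Therefore, $\{X_t^*\}$ is recurrent.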




In order to see that $\{X_t^*\}$ is actually ergodic, we recall the proof of Proposition 3.3. If we suppose $\inf_{\mu \in \bar{C}_m} I^{W^*}(\mu) > 0$, where $m$ is chosen in (3.20), we can prove Proposition 3.3, which implies that $\{X_t^*\}$ is transient. Hence, we see that

(3.28) $\inf_{\mu \in \bar{C}_m} I^{W^*}(\mu) = 0.$

Since $\bar{C}_m$ is compact, the infimum in (3.28) is attained at some $\mu^* \in \bar{C}_m$. Then, it follows from Lemmas 3.5 and 3.6 that $\mu^*$ is an invariant measure for $\{X_t^*\}$. □


3.2. Uniqueness of solutions corresponding to the bottom. We proved that the solution $(W^*, \Lambda^*)$ of (1.1) for $\Lambda^* = \inf \mathcal{A}$ corresponds to ergodicity of $\{X_t^*\}$ of (3.1). Now we shall show that the solution corresponding to $\Lambda^*$ is unique up to an additive constant. Note that the solution of (1.1) has ambiguity on an additive constant.


THEOREM 3.8. Let $W_i^*$, $i = 1, 2$, be solutions of (1.1) corresponding to $\Lambda^* = \inf \mathcal{A}$. Under (A1)-(A3), there exists a constant $k$ such that $W_2^*(x) = W_1^*(x) + k$.

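PROOF. Since $W_i^*$, $i = 1, 2$, are solutions of (1.2),

$\frac{1}{2} a^{ij} D_{ij} W_1^* + \frac{1}{2}\, \hat{a} \nabla W_1^* \cdot \nabla W_1^* + \tilde{b} \cdot \nabla W_1^* + V = \Lambda^*,$
$\frac{1}{2} a^{ij} D_{ij} W_2^* + \frac{1}{2}\, \hat{a} \nabla W_2^* \cdot \nabla W_2^* + \tilde{b} \cdot \nabla W_2^* + V = \Lambda^*.$
Subtracting each side in the above equations, we have
(3.29) $\frac{1}{2} a^{ij} D_{ij}(W_1^* - W_2^*) + (\tilde{b} + \hat{a}\nabla W_2^*) \cdot (\nabla W_1^* - \nabla W_2^*) + \frac{1}{2}\, \hat{a}(\nabla W_1^* - \nabla W_2^*) \cdot (\nabla W_1^* - \nabla W_2^*) = 0.$

Let us set $\phi(x) = e^{c(W_1^*(x) - W_2^*(x))}$, where $c$ is in Remark 2.1. Rewriting (3.29) in terms of $\phi$, we have

$\frac{1}{2} a^{ij} D_{ij} \phi + (\tilde{b} + \hat{a}\nabla W_2^*) \cdot \nabla \phi + \frac{1}{2c}\, (\hat{a} - ca)\, \frac{\nabla \phi}{\phi} \cdot \nabla \phi = 0.$

Hence it is implied from Remark 2.1 that

(3.30) $L\phi \equiv \frac{1}{2} a^{ij} D_{ij} \phi + (\tilde{b} + \hat{a}\nabla W_2^*) \cdot \nabla \phi \le 0.$

Let us take $x, y \in \mathbb{R}^N$ and consider the SDE of (3.1) for $W = W_2^*$:
$dX_t^* = \big(\tilde{b}(X_t^*) + \hat{a}\nabla W_2^*(X_t^*)\big)\,dt + \sigma(X_t^*)\,dB_t, \quad X_0^* = x.$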


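Define $\tau_{B_n} = \inf\{t : X_t^* \notin B_n\}$, $\sigma_{B_\varepsilon(y)} = \inf\{t : X_t^* \in B_\varepsilon(y)\}$. Note that $\{X_t^*\}$ is ergodic by Theorem 3.7, and in particular recurrent. Write $\rho_n = t \wedge \tau_{B_n} \wedge \sigma_{B_\varepsilon(y)}$. It follows from the Itô formula and (3.30) that

$\phi(X^*_{\rho_n}) = \phi(x) + \int_0^{\rho_n} L\phi(X_s^*)\,ds + \int_0^{\rho_n} \nabla \phi(X_s^*) \cdot \sigma(X_s^*)\,dB_s$
$\le \phi(x) + \int_0^{\rho_n} \nabla \phi(X_s^*) \cdot \sigma(X_s^*)\,dB_s.$

Thus we have $E_x[\phi(X^*_{t \wedge \tau_{B_n} \wedge \sigma_{B_\varepsilon(y)}})] \le \phi(x)$. By taking the limit as $n \to \infty$, it follows by Fatou's lemma that $E_x[\phi(X^*_{t \wedge \sigma_{B_\varepsilon(y)}})] \le \phi(x)$. Noting that $P_x[\sigma_{B_\varepsilon(y)} < \infty] = 1$, we have by sending $t \to \infty$,

$E_x[\phi(X^*_{\sigma_{B_\varepsilon(y)}})] \le \phi(x).$

We note again that $\{X_t^*\}$ hits the boundary of $B_\varepsilon(y)$ in finite time with probability 1. Hence we can see that

$\phi(x) \ge E_x[\phi(X^*_{\sigma_{B_\varepsilon(y)}})] \ge \inf_{z \in \partial B_\varepsilon(y)} \phi(z).$

Taking the limit as $\varepsilon \to 0$, we obtain
$\phi(x) \ge \phi(y), \quad x, y \in \mathbb{R}^N,$
which implies $\phi$ is constant. Therefore $W_1^* - W_2^*$ is also constant. □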


EXAMPLE 3.9. Let us consider the linear case:
$b(x) = Dx;$
$a(x) = a, \quad \hat{a}(x) = \hat{a};$
$V(x) = \frac{1}{2}\, x \cdot Mx + v \cdot x.$

$D$, $M$, $a$, $\hat{a}$ are matrices and $M$ is symmetric; $a$, $\hat{a}$ are positive-definite. We consider a quadratic solution $W$,

$W(x) = \frac{1}{2}\, x \cdot Kx + e \cdot x.$
Then
$K \hat{a} K + D^T K + K D + M = 0,$
$(D^T + K \hat{a})\, e + v = 0,$
$\Lambda = \frac{1}{2} \operatorname{tr}(aK) + \frac{1}{2}\, e \cdot \hat{a} e.$

If $M$ is negative-definite, then there is a unique solution $K$ such that $K$ is nonpositive-definite and $D + \hat{a}K$ is stable. See [15]. For such $K$, the equation for $e$ can be uniquely solved. The stability of $D + \hat{a}K$ implies that the diffusion of (3.1) for this $W$ is ergodic; here $\sigma$ is the square root of $a$. This implies that $\Lambda$ defined above is equal to $\Lambda^*$ and the $W$ obtained is the unique solution corresponding to $\Lambda^*$. We note there are also solutions that are not quadratic.

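In particular, in the one-dimensional case, the equation for $K$ becomes

$\hat{a} K^2 + 2DK + M = 0.$
If $M < 0$, the solutions $K_\pm$ are given as follows. Let

$K_+ = -\frac{D}{\hat{a}} + \sqrt{\Big(\frac{D}{\hat{a}}\Big)^2 - \frac{M}{\hat{a}}}, \qquad K_- = -\frac{D}{\hat{a}} - \sqrt{\Big(\frac{D}{\hat{a}}\Big)^2 - \frac{M}{\hat{a}}}.$

Then
$D + \hat{a} K_+ = \hat{a} \sqrt{\Big(\frac{D}{\hat{a}}\Big)^2 - \frac{M}{\hat{a}}} > 0$
and
$D + \hat{a} K_- = -\hat{a} \sqrt{\Big(\frac{D}{\hat{a}}\Big)^2 - \frac{M}{\hat{a}}} < 0.$

The solution corresponding to $K_-$ is $W^*$.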
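The one-dimensional case is easy to verify numerically. The following sketch solves the scalar quadratic $\hat{a}K^2 + 2DK + M = 0$ and checks which root gives a stable closed-loop drift $D + \hat{a}K$; the numerical values of `a_hat`, `D` and `M` are arbitrary illustrative choices with $M < 0$.

```python
import math


def riccati_roots(a_hat, D, M):
    """Roots (K_plus, K_minus) of a_hat*K**2 + 2*D*K + M = 0, assuming a_hat > 0."""
    disc = (D / a_hat) ** 2 - M / a_hat
    if disc < 0.0:
        raise ValueError("no real root: (D/a_hat)^2 - M/a_hat is negative")
    root = math.sqrt(disc)
    return -D / a_hat + root, -D / a_hat - root


a_hat, D, M = 2.0, 0.5, -3.0          # illustrative coefficients, M < 0
K_plus, K_minus = riccati_roots(a_hat, D, M)

residual_plus = a_hat * K_plus ** 2 + 2.0 * D * K_plus + M
residual_minus = a_hat * K_minus ** 2 + 2.0 * D * K_minus + M

drift_plus = D + a_hat * K_plus       # positive: unstable linear drift
drift_minus = D + a_hat * K_minus     # negative: stable linear drift
```

With these values the roots are $K_+ = 1$ and $K_- = -1.5$, and only $K_-$ is nonpositive with $D + \hat{a}K_- < 0$, matching the selection rule stated for the matrix case.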


EXAMPLE 3.10. In [9, 10], the following conditions are considered:

(a) $a(x) = \hat{a}(x) = I$.

(b) $b(0) = 0$; $b_i(x)$ has continuous first-order derivatives, and $D_j b_i(x)$ is bounded for all $i, j$.

(c) There is $c_0 > 0$ such that for all $x, \eta \in \mathbb{R}^d$, $\eta \cdot Db(x)\eta \le -c_0 |\eta|^2$. Here $Db(x) = (D_i b_j(x))_{ij}$.

(d) $V(x)$, $D_i V(x)$, $i = 1, \ldots, d$, are bounded.

Under these conditions, Fleming and co-workers prove that there exists a unique solution $(W, \Lambda)$ satisfying the condition

$|\nabla W(x)| \le \frac{1}{c_0} \|\nabla V\|_\infty,$

where $\|\nabla V\|_\infty = \sup_x |\nabla V(x)|$. Let $\{X_t\}$ be the diffusion

$dX_t = \big(b(X_t) + \nabla W(X_t)\big)\,dt + dB_t.$

Denote
$c = \frac{1}{c_0} \|\nabla V\|_\infty.$
Then
$d|X_t|^2 = 2\Big(X_t \cdot b(X_t) + X_t \cdot \nabla W(X_t) + \frac{d}{2}\Big)\,dt + 2 X_t \cdot dB_t.$

By (b), (c) and the mean value theorem,
$x \cdot b(x) \le -c_0 |x|^2.$
The property of $W$ implies
$x \cdot \nabla W(x) \le c|x|.$

By using a routine argument and considering $|X_t|^2 \exp(c_0 t)$, we can prove

$E_x[|X_t|^2] \le \exp(-c_0 t)\,|x|^2 + \frac{1}{c_0}\Big(d + \frac{c^2}{c_0}\Big).$

This implies $\{X_t\}$ is ergodic. Therefore, $\Lambda = \Lambda^*$ and $W = W^*$.

In [15], different conditions are considered that are given as follows:

(a)' There are $c_1, c_2 > 0$ such that
$c_1 \le a(x) \le c_2, \quad c_1 \le \hat{a}(x) \le c_2.$

(b)' There are $c_0, r_0 > 0$ such that
$x \cdot b(x) \le -c_0 |x|^2, \quad |x| > r_0.$

(c)' $a_{ij}(x)$, $\hat{a}_{ij}(x)$ and $b_i(x)$ have bounded first-order derivatives.

(d)' $V(x)$, $\nabla V(x)$ are bounded.

Then (1.1) with A = A* has unique solution W* with W*(0) = 0. Moreover, for 
any a, D > 0, there are co, g such that 


|W*(x)| salxl? + cap, 
IVW*()| <alxl? + cap. 
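4. Perturbation of coefficients. In the present section we shall consider the structures of solutions of (1.1) under perturbation of coefficients. This is to consider the following equation parameterized by $n \in \mathbb{N}$:

(4.1) $\frac{1}{2} D_i(a_n^{ij} D_j W_n) + \frac{1}{2}\, \hat{a}_n^{ij} D_i W_n D_j W_n + b_n \cdot \nabla W_n + V_n = \Lambda_n \quad \text{in } \mathbb{R}^N,$
or equivalently
(4.2) $\frac{1}{2} a_n^{ij} D_{ij} W_n + \frac{1}{2}\, \hat{a}_n^{ij} D_i W_n D_j W_n + \tilde{b}_n \cdot \nabla W_n + V_n = \Lambda_n,$
$\tilde{b}_n^i(x) = b_n^i(x) + \frac{1}{2} D_j a_n^{ij}(x).$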



In the same way as (2.1) we define $\mathcal{A}_n$ as follows:
(4.3) $\mathcal{A}_n = \{\Lambda_n;\ \text{there exists a smooth } W_n \text{ satisfying (4.1) for } \Lambda_n\}.$

Under certain assumptions, we can show that $\mathcal{A}_n = [\Lambda_n^*, \infty)$ for some finite $\Lambda_n^*$. Furthermore, we can classify the elements $\Lambda_n$ of $\mathcal{A}_n$ according to the global behavior of the diffusion process defined by the SDE:

(4.4) $dX_t = \big(\tilde{b}_n(X_t) + \hat{a}_n \nabla W_n(X_t)\big)\,dt + \sigma_n(X_t)\,dB_t, \quad \sigma_n(x) = a_n(x)^{1/2},$

where $W_n$ is a solution of (4.1) corresponding to $\Lambda_n$ and $\{B_t\}$ is an $\mathcal{F}_t$-standard Brownian motion on a filtered probability space $(\Omega, \mathcal{F}, P, \{\mathcal{F}_t\})$. Indeed, we can prove that $\Lambda_n = \Lambda_n^*$ (resp. $\Lambda_n > \Lambda_n^*$) corresponds to ergodicity (resp. transience) of $\{X_t\}$ defined by (4.4), and $W_n^*$ corresponding to $\Lambda_n^*$ is unique up to additive constants.

It is interesting to study stability of the solution $(W_n^*, \Lambda_n^*)$ corresponding to the bottom of $\mathcal{A}_n$ under perturbation of coefficients. Suppose that all coefficients converge to the corresponding ones, respectively, in some sense:

$a_n^{ij} \to a^{ij}, \quad \hat{a}_n^{ij} \to \hat{a}^{ij}, \quad b_n \to b, \quad V_n \to V \quad \text{as } n \to \infty.$

We hope to prove that $W_n^*$ and $\Lambda_n^*$ converge to $W^*$ and $\Lambda^* = \inf \mathcal{A}$, respectively, where $\mathcal{A}$ is defined by (2.1) and $W^*$ is a unique solution of (1.1) corresponding to $\Lambda^*$. This means that the solutions corresponding to $\Lambda_n^* = \inf \mathcal{A}_n$ are stable under the perturbation.

It turns out this is a delicate problem. For the illustration of the idea, we shall be content with the following special example. The result obtained will be used in Section 5. We refer to [16] for more general discussion and a counterexample.

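We now consider the following special equation:

(4.5) $\frac{1}{2} a^{ij} D_{ij} W_n + \frac{1}{2}\, \hat{a}^{ij} D_i W_n D_j W_n + \tilde{b} \cdot \nabla W_n + \overline{V}_n = \Lambda_n, \quad \overline{V}_n = V_0 + V_n.$
The equation corresponding to the limit of (4.5) is as follows:
(4.6) $\frac{1}{2} a^{ij} D_{ij} W + \frac{1}{2}\, \hat{a}^{ij} D_i W D_j W + \tilde{b} \cdot \nabla W + \overline{V} = \Lambda, \quad \overline{V} = V_0 + V.$
We assume the following conditions. We suppose implicitly that all the coefficients are smooth.

(B1) There exists a smooth function $W_0$ such that
$U_0 = -\big(\frac{1}{2} a^{ij} D_{ij} W_0 + \frac{1}{2}\, \hat{a}^{ij} D_i W_0 D_j W_0 + \tilde{b} \cdot \nabla W_0 + V_0\big) \to \infty$
as $|x| \to \infty$.

(B2) $V_n$, $DV_n$ are bounded in $\mathbb{R}^N$ uniformly in $n$.

(B3) $V_n$ converges to $V$ uniformly on each compact set as $n \to \infty$.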



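Let $W_n^*$ (resp. $W^*$) be a solution of (4.5) [resp. (4.6)] corresponding to $\Lambda_n^*$ (resp. $\Lambda^*$).


PROPOSITION 4.1. Under (A1), (A2), (B1)-(B3), $\Lambda_n^*$ converges to $\Lambda^*$.


PROOF. It is easy to prove that $\liminf_{n \to \infty} \Lambda_n^* \ge \Lambda^*$. We shall prove $\limsup_{n \to \infty} \Lambda_n^* \le \Lambda^*$. Since $W_n^*$ (resp. $W^*$) is a solution of (4.5) for $\Lambda_n^*$ [resp. (4.6) for $\Lambda^*$],

$\frac{1}{2} a^{ij} D_{ij} W_n^* + \frac{1}{2}\, \hat{a}^{ij} D_i W_n^* D_j W_n^* + \tilde{b} \cdot \nabla W_n^* + V_0 + V_n = \Lambda_n^*,$
$\frac{1}{2} a^{ij} D_{ij} W^* + \frac{1}{2}\, \hat{a}^{ij} D_i W^* D_j W^* + \tilde{b} \cdot \nabla W^* + V_0 + V = \Lambda^*.$
By subtracting both sides, we have
$\frac{1}{2} a^{ij} D_{ij}(W^* - W_n^*) + (\tilde{b} + \hat{a}\nabla W_n^*) \cdot \nabla (W^* - W_n^*) + \frac{1}{2}\, \hat{a} \nabla (W^* - W_n^*) \cdot \nabla (W^* - W_n^*) = \Lambda^* - \Lambda_n^* + V_n - V.$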
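Then, we can see that for $c > 0$,

$\frac{1}{2} a^{ij} D_{ij}\big(c(W^* - W_n^*)\big) + (\tilde{b} + \hat{a}\nabla W_n^*) \cdot \nabla \big(c(W^* - W_n^*)\big) + \frac{1}{2}\, a \nabla \big(c(W^* - W_n^*)\big) \cdot \nabla \big(c(W^* - W_n^*)\big)$
$= c(\Lambda^* - \Lambda_n^* + V_n - V) + \frac{1}{2c}\, (ca - \hat{a}) \nabla \big(c(W^* - W_n^*)\big) \cdot \nabla \big(c(W^* - W_n^*)\big).$

If we take $c > 0$ from Remark 2.1, then
(4.7) $\frac{1}{2} a^{ij} D_{ij}\big(c(W^* - W_n^*)\big) + (\tilde{b} + \hat{a}\nabla W_n^*) \cdot \nabla \big(c(W^* - W_n^*)\big) + \frac{1}{2}\, a \nabla \big(c(W^* - W_n^*)\big) \cdot \nabla \big(c(W^* - W_n^*)\big) \le c(\Lambda^* - \Lambda_n^* + V_n - V).$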


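Let $\mu_n^*$ be the invariant measure of the diffusion process $X_t^{n,*}$ defined by the following SDE:

$dX_t^{n,*} = (\tilde{b} + \hat{a}\nabla W_n^*)(X_t^{n,*})\,dt + \sigma(X_t^{n,*})\,dB_t, \quad X_0^{n,*} = x.$
By the construction of $\mu_n^*$, we see that
(4.8) $I^{W_n^*}(\mu_n^*) = 0,$

where

$I^{W_n^*}(\mu) = -\inf_{u \in U_n} \int \frac{L_n u}{u}\,d\mu, \quad \mu \in M_1(\mathbb{R}^N),$
$L_n = \frac{1}{2} a^{ij} D_{ij} + (\tilde{b} + \hat{a}\nabla W_n^*) \cdot \nabla,$
$U_n = \{u \in C^2(\mathbb{R}^N);\ u(x) > 0,\ L_n u / u \text{ is bounded above}\}.$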


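Set $u = \exp(c(W^* - W_n^*))$. Then, from (4.7), we have

(4.9) $\frac{L_n u}{u} = \frac{1}{2} a^{ij} D_{ij}\big(c(W^* - W_n^*)\big) + (\tilde{b} + \hat{a}\nabla W_n^*) \cdot \nabla \big(c(W^* - W_n^*)\big) + \frac{1}{2}\, a \nabla \big(c(W^* - W_n^*)\big) \cdot \nabla \big(c(W^* - W_n^*)\big) \le c(\Lambda^* - \Lambda_n^* + V_n - V).$

Since $V$, $V_n$ are bounded, $u \in U_n$. Hence, by (4.8),
$\int \frac{L_n u}{u}\,d\mu_n^* \ge 0.$
Then, the above inequality and (4.9) imply that
$c \int (\Lambda^* - \Lambda_n^* + V_n - V)\,d\mu_n^* \ge 0.$
Therefore we have
$\Lambda^* - \Lambda_n^* \ge \int (V - V_n)\,d\mu_n^*.$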
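We note that $\{\mu_n^*\}$ is tight by the proof of Proposition 3.3. Indeed, $\{\mu_n^*\} \subset C_m$, where

$C_m = \{\mu \in M_1(\mathbb{R}^N);\ \mu(B_l) \ge 1 - \delta_l \ \ \forall l \ge m\}, \quad \delta_l = \frac{M + |\beta_0|}{|\beta_0| + \beta_l},$

$\beta_0 = \inf_x c(U_0(x) + \Lambda_0), \quad \beta_l = \inf_{|x| \ge l} c(U_0(x) + \Lambda_0),$
$U_0(x) = -\sup_n \big(\frac{1}{2} a^{ij} D_{ij} W_0 + \frac{1}{2}\, \hat{a} \nabla W_0 \cdot \nabla W_0 + \tilde{b} \cdot \nabla W_0 + V_0 + V_n\big),$
$\Lambda_0 = \inf_n \Lambda_n^*.$
$W_0$ is from (B1) and $m$ is taken so that
$\beta_l > 0 \quad \forall l \ge m.$

Since $V_n$ are uniformly bounded and $V_n$ converges to $V$ uniformly on compact sets, we can see that
$\int (V - V_n)\,d\mu_n^* \to 0 \quad (n \to \infty).$

Therefore, we have

$\limsup_{n \to \infty} \Lambda_n^* \le \Lambda^*.$ □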


THEOREM 4.2. Let $W_n^*$ (resp. $W^*$) be a solution of (4.5) for $\Lambda_n^*$ [resp. (4.6) for $\Lambda^*$]. Under (A1), (A2), (B1)-(B3), $W_n^*$ (resp. $\Lambda_n^*$) converges to $W^*$ (resp. $\Lambda^*$) uniformly on compact sets, $H^1_{\mathrm{loc}}$-strongly as $n \to \infty$.


PROOF. By using the estimate of $\nabla W_n^*$ as in Lemma 2.4 and the argument in the proof of Theorem 2.6, we can show that $W_n^*$ (resp. $\Lambda_n^*$) converges to some $\overline{W}$ (resp. $\overline{\Lambda}$) uniformly on compact sets, $H^1_{\mathrm{loc}}$-strongly, by taking a subsequence if necessary.

Furthermore $(\overline{W}, \overline{\Lambda})$ is a solution of (4.6). Indeed, we see that $\overline{\Lambda} = \Lambda^*$ by Proposition 4.1. By uniqueness of solution corresponding to $\Lambda^*$ in Proposition 4.1, we have $\overline{W}(x) = W^*(x)$. □


5. Representation of $\Lambda^*$. For (1.1) with the coefficients satisfying the conditions (A1)-(A3), we have proved that there is a unique solution $(W^*, \Lambda^*)$ with $W^*(0) = 0$ and $\Lambda = \Lambda^*$ is the smallest such that (1.1) has a solution $W$. In this section we will give a representation of $\Lambda^*$. From the representation, we will get some moment condition for $\mu^*$, the invariant measure for the diffusion in (3.1) constructed from $W = W^*$. Before we state our main results, we give some notation.

We shall consider a family of $V$. That is, we consider a particular $V_0$ and the bounded perturbation of $V_0$, say $V_0 + V$ for bounded $V$. Therefore, in this section we consider the equation

(5.1) $\frac{1}{2} D_i(a^{ij} D_j W) + \frac{1}{2}\, \hat{a}^{ij} D_i W D_j W + b \cdot \nabla W + V_0 + V = \Lambda \quad \text{in } \mathbb{R}^N.$

The smallest $\Lambda$ such that this has a solution $W$ is denoted $\Lambda^*(V)$. We still use $W^*$ for the solution corresponding to $\Lambda^*(V)$. We shall mention if we want to emphasize the dependence of $W^*$ on $V$.

In this section we assume the following condition:

(A3) $V_0$ is smooth and there exists a smooth function $W_0$ such that
$U_0(x) = -\big(\frac{1}{2} D_i(a^{ij} D_j W_0) + b \cdot \nabla W_0 + \frac{1}{2}\, \hat{a}^{ij} D_i W_0 D_j W_0 + V_0\big) \to \infty$
as $|x| \to \infty$.

We consider $V$ satisfying the condition:
(5.2) $V$ is smooth, $|V(x)| \le c$, $|DV(x)| \le c$,

where $c$ is a constant that may depend on $V$.
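Let $W$ be a smooth function. We denote

(5.3) $G(W) = \frac{1}{2} D_i(a^{ij} D_j W) + b \cdot \nabla W + \frac{1}{2}\, \hat{a}^{ij} D_i W D_j W + V_0.$
For each probability measure $\mu$ on $\mathbb{R}^N$, we define
$J(\mu) = \sup\{-\langle G(W), \mu \rangle;\ W \text{ is smooth and } G(W) \text{ is bounded above}\}.$

Here for a function $f$,

$\langle f, \mu \rangle = \int f(x)\,d\mu(x).$

Now we can state our main results.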




THEOREM 5.1. Assume (A1), (A2), (A3) and (5.2). Let $(W^*, \Lambda^*(V))$ be the solution defined above. Then $\Lambda^*(V)$ has the representation

$\Lambda^*(V) = \sup_{\mu \in M_1(\mathbb{R}^N)} \{\langle V, \mu \rangle - J(\mu)\}.$

Here $M_1(\mathbb{R}^N)$ denotes the set of all probability measures on $\mathbb{R}^N$. The supremum is attained at $\mu = \mu^*$, where $\mu^*$ is the invariant measure of the diffusion in (3.1) for $W = W^*$.

From this theorem, we have
$\Lambda^*(V) = \langle V, \mu^* \rangle - J(\mu^*)$
$\le \langle V, \mu^* \rangle + \langle G(W_0), \mu^* \rangle$
$= \langle V, \mu^* \rangle + \langle -U_0, \mu^* \rangle.$

Therefore, we have the following corollary.


COROLLARY 5.2. We assume the condition as in the above theorem. Then
$\int U_0(x)\,d\mu^*(x) \le -\Lambda^*(V) + \|V\|_\infty.$

In particular,

$\int U_0(x)\,d\mu^*(x) < \infty.$

Here $\|V\|_\infty$ is the supnorm of $V$.


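Before we prove Theorem 5.1, we mention some elementary properties of $\Lambda^*(V)$.

LEMMA 5.3. (i) $\Lambda^*(V)$ is Lipschitz with constant 1. That is,
$|\Lambda^*(V_1) - \Lambda^*(V_2)| \le \|V_1 - V_2\|_\infty$
for $V_1$, $V_2$ satisfying (5.2). From this, $\Lambda^*(V)$ can be defined for all bounded continuous functions $V$ by extension.
(ii) $\Lambda^*(V)$ is convex in $V$.


PROOF. Let $W_1^*$ be the solution of (5.1) for $V = V_1$, $\Lambda = \Lambda^*(V_1)$. Then
$\frac{1}{2} D_i(a^{ij} D_j W_1^*) + \frac{1}{2}\, \hat{a}^{ij} D_i W_1^* D_j W_1^* + b \cdot \nabla W_1^* + V_2 + V_0$
$= \Lambda^*(V_1) + V_2 - V_1$
$\le \Lambda^*(V_1) + \|V_2 - V_1\|_\infty.$
This implies
$\Lambda^*(V_2) \le \Lambda^*(V_1) + \|V_2 - V_1\|_\infty.$
See the argument in the proof of Theorem 2.6. Similarly,
$\Lambda^*(V_1) \le \Lambda^*(V_2) + \|V_2 - V_1\|_\infty.$

Therefore,

$|\Lambda^*(V_1) - \Lambda^*(V_2)| \le \|V_1 - V_2\|_\infty.$

We now show that $\Lambda^*(V)$ is convex. Let $V_1$, $V_2$ satisfy (5.2) and let $W_k^*$, $k = 1, 2$, satisfy

$\frac{1}{2} D_i(a^{ij} D_j W_k^*) + \frac{1}{2}\, \hat{a}^{ij} D_i W_k^* D_j W_k^* + b \cdot \nabla W_k^* + V_0 + V_k = \Lambda^*(V_k), \quad k = 1, 2.$

Let $0 < \lambda < 1$ and denote $W = \lambda W_1^* + (1 - \lambda) W_2^*$, $V = \lambda V_1 + (1 - \lambda) V_2$ and $\Lambda = \lambda \Lambda^*(V_1) + (1 - \lambda) \Lambda^*(V_2)$. Then by a simple calculation, we have

$\frac{1}{2} D_i(a^{ij} D_j W) + \frac{1}{2}\, \hat{a}^{ij} D_i W D_j W + b \cdot \nabla W + V_0 + V \le \Lambda.$
This implies

$\Lambda^*(V) \le \Lambda.$
See the proof of Theorem 2.6. That is, we have
$\Lambda^*(\lambda V_1 + (1 - \lambda) V_2) \le \lambda \Lambda^*(V_1) + (1 - \lambda) \Lambda^*(V_2).$ □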
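As a loose finite-state analogue (not the object studied in the paper), the same two properties can be observed for the principal eigenvalue of $Q + \operatorname{diag}(V)$, where $Q$ is the generator of a two-state Markov chain. The sketch below assumes hypothetical jump rates `p`, `q` and potentials chosen only for illustration, and evaluates the eigenvalue in closed form.

```python
import math


def principal_eigenvalue(v1, v2, p=1.0, q=2.0):
    """Largest eigenvalue of [[v1 - p, p], [q, v2 - q]].

    The matrix is Q + diag(v1, v2), where Q is the generator of a two-state
    chain with jump rates p and q (both assumed positive).
    """
    d1, d2 = v1 - p, v2 - q
    return 0.5 * (d1 + d2 + math.sqrt((d1 - d2) ** 2 + 4.0 * p * q))


V1 = (0.3, -1.0)      # illustrative potential
V2 = (-0.8, 0.5)      # illustrative potential
lam = 0.5             # convex-combination weight

eig1 = principal_eigenvalue(*V1)
eig2 = principal_eigenvalue(*V2)
sup_dist = max(abs(V1[0] - V2[0]), abs(V1[1] - V2[1]))

V_mid = tuple(lam * x + (1.0 - lam) * y for x, y in zip(V1, V2))
eig_mid = principal_eigenvalue(*V_mid)

# Adding a constant to the potential shifts the eigenvalue by that constant.
shift = 0.7
eig_shifted = principal_eigenvalue(V1[0] + shift, V1[1] + shift)
```

In this toy setting the eigenvalue is 1-Lipschitz in the sup norm, convex, and commutes with constant shifts, mirroring Lemma 5.3 and the identity $\Lambda^*(V + \alpha) = \Lambda^*(V) + \alpha$ used later.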


Denote by $C_b(\mathbb{R}^N)$ the collection of all bounded continuous functions defined on $\mathbb{R}^N$. $C_b(\mathbb{R}^N)$ is a Banach space with the supnorm. The dual space $C_b(\mathbb{R}^N)^*$ can be identified with the set of all regular bounded finitely additive set functions defined on the field generated by closed sets of $\mathbb{R}^N$. That is, for an element $T \in C_b(\mathbb{R}^N)^*$, there is a regular bounded finitely additive set function $\mu$ such that

$T(V) = \int V(x)\,d\mu(x), \quad V \in C_b(\mathbb{R}^N).$

See [6], Theorem IV.6.2. For regular additive functions see [6], Theorem III.5.11. We note that $M_1(\mathbb{R}^N)$ is a subset of $C_b(\mathbb{R}^N)^*$.

For $\mu \in C_b(\mathbb{R}^N)^*$, we define

$I(\mu) = \sup_{V \in C_b(\mathbb{R}^N)} \{\langle V, \mu \rangle - \Lambda^*(V)\}.$


PROPOSITION 5.4. Let $V$ be bounded continuous. Then

$\Lambda^*(V) = \sup_{\mu \in C_b(\mathbb{R}^N)^*} \{\langle V, \mu \rangle - I(\mu)\}.$

See [7], Proposition 4.1, Chapter 1 or [23], Theorem 7.15.

We shall prove that $I(\mu) = J(\mu)$ if $\mu$ is a probability measure. For $V$ satisfying (5.2), the supremum is attained at $\mu = \mu^*$, where $\mu^*$ is the invariant measure of the diffusion in (3.1) for $W = W^*$. Our main theorem is a consequence of this. We begin with some elementary observations. We follow essentially the argument in [23].




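LEMMA 5.5. If $I(\mu) < \infty$, then $\mu$ is nonnegative and $\mu(\mathbb{R}^N) = 1$.


PROOF. We first prove $I(\mu) = \infty$ if $\mu$ is not nonnegative. For such $\mu$, we take $V \ge 0$ such that $\langle V, \mu \rangle < 0$. For any $\alpha > 0$,

$I(\mu) \ge \langle -\alpha V, \mu \rangle - \Lambda^*(-\alpha V).$
Now $\Lambda^*(-\alpha V) \le \Lambda^*(0)$, since $-\alpha V \le 0$. Therefore,
$I(\mu) \ge -\alpha \langle V, \mu \rangle - \Lambda^*(0) \to \infty$

as $\alpha$ tends to infinity. Hence $I(\mu) = \infty$.

We now prove $I(\mu) = \infty$ if $\mu(\mathbb{R}^N) \ne 1$. For such $\mu$,

$I(\mu) \ge \langle \alpha, \mu \rangle - \Lambda^*(\alpha) = \alpha \mu(\mathbb{R}^N) - \alpha - \Lambda^*(0) = \alpha(\mu(\mathbb{R}^N) - 1) - \Lambda^*(0).$

Here $\alpha$ is any real number. From this, it is easy to see that $I(\mu) = \infty$. □

LEMMA 5.6. Let $\mu$ be a probability measure. Then $I(\mu) \ge J(\mu)$.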


PROOF. Let $W$ be a smooth function such that $G(W)$ is bounded above. Take
$V_n = \min\{-G(W), n\}.$

Then $V_n$ is a bounded continuous function. It is easy to see that

$\frac{1}{2} D_i(a^{ij} D_j W) + \frac{1}{2}\, \hat{a}^{ij} D_i W D_j W + b \cdot \nabla W + V_0 + V_n \le 0.$
Therefore, $\Lambda^*(V_n) \le 0$. From the relation
$I(\mu) \ge \langle V_n, \mu \rangle - \Lambda^*(V_n) \ge \langle V_n, \mu \rangle,$

and since $V_n \to -G(W)$, $V_n \le V_{n+1}$ and the $V_n$ are bounded below, we can apply the monotone convergence theorem to get

$I(\mu) \ge -\langle G(W), \mu \rangle.$
Then
$I(\mu) \ge \sup\{-\langle G(W), \mu \rangle;\ W \text{ is smooth},\ G(W) \text{ is bounded above}\} = J(\mu).$ □


LEMMA 5.7. Let $V \in C_b(\mathbb{R}^N)$. Define

$J^*(V) = \sup_{\mu \in M_1(\mathbb{R}^N)} \{\langle V, \mu \rangle - J(\mu)\}.$

Then $J^*(V) \le \Lambda^*(V)$.


PROOF. It is enough to prove this for $V$ satisfying (5.2), which we assume now. We first observe the relation

$J^*(V) = \sup_\mu \big\{\langle V, \mu \rangle + \inf_W \langle G(W), \mu \rangle\big\}$
$= \sup_\mu \inf_W \langle G(W) + V, \mu \rangle$
$\le \inf_W \sup_\mu \langle G(W) + V, \mu \rangle$
$= \inf_W \sup_{x \in \mathbb{R}^N} \big(G(W)(x) + V(x)\big).$

Here the $\mu$, $W$ are taken over all those satisfying $\mu \in M_1(\mathbb{R}^N)$ and $W$ smooth with $G(W)$ bounded above. Let $W^*$ be the solution of (5.1) for $\Lambda = \Lambda^*(V)$. Taking $W = W^*$ in the above relation, we have $J^*(V) \le \Lambda^*(V)$. □


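LEMMA 5.8. Let $V$ satisfy (5.2) and let $\mu^*$ be the invariant measure of the diffusion in (3.1) for $\Lambda = \Lambda^*(V)$ and $W^*$ the solution of (5.1). Then

$\Lambda^*(V) = \langle V, \mu^* \rangle - I(\mu^*).$


PROOF. We need to prove that
$\langle V', \mu^* \rangle - \Lambda^*(V') \le \langle V, \mu^* \rangle - \Lambda^*(V)$
for all $V' \in C_b(\mathbb{R}^N)$. This is equivalent to
$\Lambda^*(V') - \Lambda^*(V) \ge \langle V' - V, \mu^* \rangle, \quad V' \in C_b(\mathbb{R}^N).$
Since $\Lambda^*(\cdot)$ is a convex function on $C_b(\mathbb{R}^N)$, there is a subgradient $\tilde{\mu} \in C_b(\mathbb{R}^N)^*$ of this function at $V$ such that
$\Lambda^*(V') - \Lambda^*(V) \ge \langle V' - V, \tilde{\mu} \rangle, \quad V' \in C_b(\mathbb{R}^N).$

See [7], Proposition 5.2, Chapter 1. We only need to prove $\tilde{\mu} = \mu^*$, since this implies the claim.

First, the nondecreasing property of $\Lambda^*(\cdot)$ implies that $\tilde{\mu}$ is nonnegative.

Applying the above relation to $V' = V + \alpha$, where $\alpha$ is constant, and using $\Lambda^*(V + \alpha) = \Lambda^*(V) + \alpha$, we can easily deduce $\tilde{\mu}(\mathbb{R}^N) = 1$.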

Now take smooth functions $\phi_n$ on $\mathbb{R}^N$ satisfying the following properties: $0 \le \phi_n \le \phi_{n+1} \le 1$, $\phi_n$ has compact support, $\nabla \phi_n$ are bounded uniformly in $n$, and $\phi_n \to 1$ uniformly on compact sets. Then Proposition 4.1 implies $\Lambda^*(V + \alpha \phi_n) \to \Lambda^*(V + \alpha) = \Lambda^*(V) + \alpha$ as $n \to \infty$. Then

$\limsup_{n \to \infty} \alpha \langle \phi_n, \tilde{\mu} \rangle \le \alpha$

for all $\alpha$. From this, we have
$\lim_{n \to \infty} \langle 1 - \phi_n, \tilde{\mu} \rangle = 0.$

Then $\tilde{\mu}$ must be a probability measure. This is a consequence of [6], Theorem III.5.13.
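We now prove

$\int \Big(\frac{1}{2} a^{ij}(x) D_{ij} W(x) + \big(\tilde{b}(x) + \hat{a}\nabla W^*(x)\big) \cdot \nabla W(x)\Big)\,d\tilde{\mu}(x) = 0$

for every smooth function $W$ on $\mathbb{R}^N$ with compact support. This implies $\tilde{\mu}$ is an invariant measure of the diffusion in (3.1). Hence $\tilde{\mu} = \mu^*$ by the uniqueness of the invariant measure. To prove this last statement, we take such $W$ and consider

$V' = V - \big(\frac{1}{2} a^{ij} D_{ij} W + (\tilde{b} + \hat{a}\nabla W^*) \cdot \nabla W + \frac{1}{2}\, \hat{a}^{ij} D_i W D_j W\big).$
Then by a simple calculation, we see
$\frac{1}{2} a^{ij} D_{ij}(W + W^*) + \tilde{b} \cdot \nabla (W + W^*) + \frac{1}{2}\, \hat{a}^{ij} D_i(W + W^*) D_j(W + W^*) + V_0 + V' = \Lambda^*(V).$
Therefore, $\Lambda^*(V') \le \Lambda^*(V)$. Then,
$0 \ge \langle V' - V, \tilde{\mu} \rangle = -\big\langle \frac{1}{2} a^{ij} D_{ij} W + (\tilde{b} + \hat{a}\nabla W^*) \cdot \nabla W + \frac{1}{2}\, \hat{a}^{ij} D_i W D_j W,\ \tilde{\mu} \big\rangle.$
We replace $W$ by $\alpha W$, $\alpha > 0$, divide the relation by $\alpha$ and let $\alpha$ tend to 0. We get
$-\big\langle \frac{1}{2} a^{ij} D_{ij} W + (\tilde{b} + \hat{a}\nabla W^*) \cdot \nabla W,\ \tilde{\mu} \big\rangle \le 0.$
We replace $W$ by $-W$. Then we find
$\big\langle \frac{1}{2} a^{ij} D_{ij} W + (\tilde{b} + \hat{a}\nabla W^*) \cdot \nabla W,\ \tilde{\mu} \big\rangle = 0.$
This is what we want to prove. □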


COROLLARY 5.9. Let $V \in C_b(\mathbb{R}^N)$. Then $J^*(V) = \Lambda^*(V)$. Let $\mu$ be a probability measure. Then $I(\mu) = J(\mu)$.


PROOF. We assume $V \in C_b(\mathbb{R}^N)$ satisfying (5.2). Let $\mu^*$ be the invariant measure for the diffusion in (3.1) with $W = W^*$. Then

$J^*(V) \ge \langle V, \mu^* \rangle - J(\mu^*) \ge \langle V, \mu^* \rangle - I(\mu^*) = \Lambda^*(V).$

But we have already proved $J^*(V) \le \Lambda^*(V)$. Therefore, they are equal.

We now prove $I(\mu) = J(\mu)$. We use the relation

$J(\mu) = \sup_{V \in C_b(\mathbb{R}^N)} \{\langle V, \mu \rangle - J^*(V)\}.$

See [23], Theorem 7.18. By definition,

$I(\mu) = \sup_{V \in C_b(\mathbb{R}^N)} \{\langle V, \mu \rangle - \Lambda^*(V)\}.$

Since $J^*(V) = \Lambda^*(V)$ for all $V$, we have $I(\mu) = J(\mu)$. □




Acknowledgments. The authors would like to thank the referees for helpful 
comments and suggestions. This research was done while the first author was a 
post-doctoral Research Fellow at Institute of Mathematics, Academia Sinica, Tai- 
wan. The first author is grateful to Academia Sinica for giving him an opportunity 
to visit there for his research. 


REFERENCES 


[1] BENSOUSSAN, A. (1988). Perturbation Methods in Optimal Control. Wiley, New York. 

[2] BENSOUSSAN, A. and FREHSE, J. (1992). On Bellman equations of ergodic control in RN. 
J. Reine Angew. Math. 429 125-160. 

[3] BIELECKI, T. R. and PLISKA, S. R. (1999). Risk sensitive dynamic asset management. 
Appl. Math. Optim. 39 337-360. 

[4] DONSKER, M. D. and VARADHAN, S. R. S. (1975). Asymptotic evaluation of certain Wiener 
integrals for large time. In Functional Integration and Its Applications (A. M. Arthurs, 
ed.) 15-33. Oxford Univ. Press. 

[5] DONSKER, M. D. and VARADHAN, S. R. S. (1976). Asymptotic evaluation of certain Markov 
processes expectations for large time III. Comm. Pure Appl. Math. 29 389—461. 

[6] DUNFORD, N. and SCHWARTZ, J. T. (1988). Linear Operators Part I: General Theory. Wiley, 
New York. 

[7] EKELAND, I. and TEMAM, R. (1976). Convex Analysis and Variational Problems. North-Holland, Amsterdam.

[8] FLEMING, W. H. (1995). Optimal investment model and risk-sensitive stochastic control. In 
IMA Vols. in Math. Appl. 65 75-88. Springer, New York. 

[9] FLEMING, W. H. and JAMES, M. R. (1995). The risk-sensitive index and the $H_2$ and $H_\infty$ norms for nonlinear systems. Math. Control Signals Systems 8 199-221.

[10] FLEMING, W. H. and MCENEANEY, W. M. (1995). Risk-sensitive control on an infinite time
horizon. SIAM J. Control Optim. 33 1881—1915. 

[11] FLEMING, W. H. and SHEU, S.-J. (1999). Optimal long term growth rate of expected utility
of wealth. Ann. Appl. Probab. 9 871—903. 

[12] FLEMING, W. H. and SHEU, S.-J. (2000). Risk sensitive control and an optimal investment 
model. Math. Finance 10 197—213. 

[13] KAISE, H. and NAGAI, H. (1998). Bellman-Isaacs equations of ergodic type related to risk- 
sensitive control and their singular limits. Asymptotic Anal. 16 347—362. 

[14] KAISE, H. and NAGAI, H. (1999). Ergodic type Bellman equations of risk-sensitive control 
with large parameters and their singular limits. Asymptotic Anal. 20 279—299. 

[15] KAISE, H. and SHEU, S.-J. (2004). Risk-sensitive optimal investment: Solutions of dynami- 
cal programming equation. In Mathematics of Finance, Contemporary Math. 351. Amer. 
Math. Soc., Providence, RI. 

[16] KAISE, H. and SHEU, S.-J. (2004). On the structure of solutions of ergodic type Bellman 
equation related to risk-sensitive control. Technical report, Academia Sinica. 

[17] KAISE, H. and SHEU, S.-J. (2004). Evaluation of large time expectations for diffusion 
processes. Technical report, Academia Sinica. 

[18] LADYZHENSKAYA, O. A. and URAL'TSEVA, N. N. (1968). Linear and Quasilinear Elliptic
Equations. Academic Press, New York. 

[19] MCENEANEY, W. M. and ITO, K. (1997). Infinite time-horizon risk-sensitive systems with 
quadratic growth. In Proceedings of 36th IEEE Conference on Decision and Control. 

[20] NAGAI, H. (1996). Bellman equation of risk-sensitive control. SIAM J. Control Optim. 34 
74—101. 




[21] NAGAI, H. (2003). Optimal strategies for risk-sensitive portfolio optimization problems for
general factor models. SIAM J. Control Optim. 41 1779-1800.

[22] PINSKY, R. G. (1995). Positive Harmonic Functions and Diffusion. Cambridge Univ. Press. 

[23] STROOCK, D. W. (1984). An Introduction to the Theory of Large Deviations. Springer, Berlin.

[24] VARADHAN, S. R. S. (1980). Diffusion Problems and Partial Differential Equations. Springer, New York.

GRADUATE SCHOOL OF INFORMATION SCIENCE
NAGOYA UNIVERSITY
FURO-CHO, CHIKUSA-KU
NAGOYA 464-8601
JAPAN
E-MAIL: kaise@is.nagoya-u.ac.jp

INSTITUTE OF MATHEMATICS
ACADEMIA SINICA
NANKANG, TAIPEI 11529
TAIWAN
REPUBLIC OF CHINA
E-MAIL: sheusj@math.sinica.edu.tw


The Annals of Probability 

2006, Vol. 34, No. 1, 321-385 

DOI: 10.1214/009117905000000567 

© Institute of Mathematical Statistics, 2006


LARGE DEVIATION FOR DIFFUSIONS AND HAMILTON-JACOBI 
EQUATION IN HILBERT SPACES 


By JIN FENG 
University of Massachusetts-Amherst 


Large deviation for Markov processes can be studied by Hamilton- 
Jacobi equation techniques. The method of proof involves three steps: First, 
we apply a nonlinear transform to generators of the Markov processes, and 
verify that limit of the transformed generators exists. Such limit induces a 
Hamilton-Jacobi equation. Second, we show that a strong form of uniqueness 
(the comparison principle) holds for the limit equation. Finally, we verify an 
exponential compact containment estimate. The large deviation principle then 
follows from the above three verifications. 

This paper illustrates such a method applied to a class of Hilbert-space- 
valued small diffusion processes. The examples include stochastically per- 
turbed Allen-Cahn, Cahn-Hilliard PDEs and a one-dimensional quasilinear 
PDE with a viscosity term. We prove the comparison principle using a variant 
of the Tataru method. We also discuss different notions of viscosity solution 
in infinite dimensions in such context. 


1. Introduction. We are interested in large deviation for small randomly perturbed diffusion processes in a Hilbert state space $E$. When $E = \mathbb{R}^d$, this is known as the Freidlin and Wentzell theory [23]. The proofs in [23] rely upon the Girsanov transformations. The idea is to estimate the probability of an atypical, large deviant event under the given probability law through a change of measure, so that the event becomes most probable under the new law. Such a technique is also repeatedly used in the Donsker and Varadhan theory [13] regarding occupation measures, which is another kind of large deviation concerning ergodic phenomena instead of small random perturbations.

There exists a different approach to the above mentioned large deviation 
problems. In the late 1970s, Fleming [19] introduced a logarithmic transform to 
generators of Markov processes, giving exit probabilities an optimal control in- 
terpretation. This observation allowed us to characterize the large deviation con- 
vergence for exit probabilities as convergence of solutions for a sequence of 
Hamilton-Jacobi equations. Later, Evans and Ishii [16], Fleming and Sougani- 
dis [21, 22], among others, applied the theory of viscosity solution to this context, 


Received January 2002; revised January 2005. 

AMS 2000 subject classifications. Primary 60F10; secondary 60J25, 49L25, 60G99.

Key words and phrases. Large deviation, stochastic evolution equation in Hilbert space, viscosity 
solution of Hamilton-Jacobi equations. 




322 | J. FENG 


enabling the approach to cover a wider variety of examples. In particular, this includes $\mathbb{R}^d$-valued diffusions with vanishing stochastic terms. During early developments of this approach, the applicable settings and structural conditions required were relatively restrictive as compared to the Girsanov transformation approach. However, this can be fixed by refining both the viscosity solution techniques and the large deviation theory. Feng and Kurtz [18] recently carried out such a program which expands the theory.

The general setting in [18] allows the state space E to be a metric space. One 
of the key technical conditions assumed is a strong form of uniqueness (i.e., the 
comparison principle, Definition 1.15) for a limit Hamilton-Jacobi type equation. 
For small perturbation type large deviations, such equation is usually a first-order 
nonlinear partial differential equation. When $E = \mathbb{R}^d$, or a subset of it, the comparison principle can usually be verified by well-known criteria in PDE theory. Using these techniques, [18] treats both the classical Freidlin-Wentzell theory and the Donsker-Varadhan theory within one framework using the generator convergence approach.

When we study large deviation for stochastic PDEs or interacting particles, we 
usually encounter function- or measure-valued state space. Comparison principles 
of these types, however, are much less well understood. On the one hand, there
exists an extensive PDE literature regarding first-order Hamilton-Jacobi equations
in Hilbert/Banach spaces (e.g., [7, 8, 31, 32]). On the other hand, the operators 
derived in the large deviation context frequently exhibit subtle differences relative 
to those studied in the PDE literature. Indeed, in the case of applications to inter- 
acting particle systems, it is more natural to consider the state space as the space 
of probability measures, rather than a Banach space. 

We restrict attention to Hilbert-space-valued diffusions only in this paper. 


1.1. Background. We consider the large deviation for Hilbert-space-valued 
diffusions with a possibly nonlinear drift term. To outline the approach and identify 
difficulties ahead of us, first, we review a general result (adapted to the situation of 
this paper) developed in [18]. 

Let $X_n$, $n = 1, 2, \ldots$, be a sequence of metric-space $S$-valued random variables. Varadhan and Bryc (e.g., Theorems 4.3.1 and 4.4.2 in [12]) discovered the following moment characterization of large deviation convergence.


PROPOSITION 1.1.

(a) Suppose $\{X_n\}$ satisfies the large deviation principle (Definition 1.17) with a good rate function $I$. Then for each $f \in C_b(S)$ (bounded continuous functions on $S$), if we define $\Lambda_n(f) = \frac{1}{n} \log E[\exp\{n f(X_n)\}]$, then

(1.1) $\lim_{n \to +\infty} \Lambda_n(f) = \lim_{n \to +\infty} \frac{1}{n} \log E[e^{n f(X_n)}] = \sup_{x \in S} \{f(x) - I(x)\} \equiv \Lambda(f).$


LARGE DEVIATIONS IN HILBERT SPACE 329 


(b) Suppose that $\{X_n\}$ is exponentially tight (Definition 1.17) and that the limit (1.1) exists for each $f \in C_b(S)$. Then $\{X_n\}$ satisfies the large deviation principle with good rate function

(1.2) $I(x) = \sup_{f \in C_b(S)} \{f(x) - \Lambda(f)\}.$


See Theorems 4.3.1 and 4.4.2 in [12]. 
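The limit in (1.1) can be illustrated on a finite state space. The sketch below assumes a hypothetical four-point space on which $P(X_n = x)$ is proportional to $e^{-n I(x)}$ for a made-up rate function `I` with minimum 0, and a made-up test function `f`; it evaluates $\Lambda_n(f)$ with a numerically stable log-sum-exp and compares it with $\max_x \{f(x) - I(x)\}$.

```python
import math


def log_sum_exp(values):
    """Stable evaluation of log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def scaled_log_moment(n, f, I):
    """Lambda_n(f) = (1/n) log E[exp(n f(X_n))], P(X_n = x) ~ exp(-n I(x))."""
    log_numerator = log_sum_exp([n * (fx - ix) for fx, ix in zip(f, I)])
    log_normalizer = log_sum_exp([-n * ix for ix in I])
    return (log_numerator - log_normalizer) / n


I = [0.0, 0.4, 1.0, 2.5]    # hypothetical rate function, min I = 0
f = [0.1, 0.9, 1.2, 2.0]    # hypothetical bounded test function

varadhan_limit = max(fx - ix for fx, ix in zip(f, I))
values = {n: scaled_log_moment(n, f, I) for n in (10, 100, 2000)}
```

For this toy family the error is at most $\log 4 / n$, so `values[2000]` already agrees with `varadhan_limit` to about three decimal places.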

The main result in [18] can be viewed as a process version of the above theorem, 
expressed at an infinitesimal level. 

To explain the result, we proceed informally first. Let {X_n(t), 0 ≤ t < +∞; 
n = 1, 2, ...} denote a sequence of metric-space E-valued Markov processes. For 
simplicity, we assume the trajectories are continuous. By the Markov property 
and continuity of the trajectories, we expect the large deviation of {X_n} to follow 
from that of the transition probability measures {P(X_n(t) ∈ dy | X_n(0) = x)} for 
all x ∈ E and t > 0. By Proposition 1.1, we also expect this to be implied by 
convergence of the functionals 

    V_n(t) f(x) = \frac{1}{n} \log E[e^{n f(X_n(t))} | X_n(0) = x] → V(t) f(x)  for some V(t) f, 

where f ∈ D ⊂ C_b(E), and D is sufficiently dense in C_b(E) in an appropriate sense. 
It turns out that, by the Markov property, V_n forms a nonlinear operator semigroup 

    V_n(s) V_n(t) = V_n(t + s),  s, t ≥ 0. 


Hence {V(t): t ≥ 0}, viewed as a collection of operators acting on functions, 
should form a semigroup as well. We identify the generator of V_n next: 

    H_n f(x) = \lim_{t→0+} \frac{1}{t} (V_n(t) f(x) - V_n(0) f(x)) 
             = \lim_{t→0+} \frac{1}{t} \frac{1}{n} \log E[e^{n f(X_n(t)) - n f(x)} | X_n(0) = x] 
             = \frac{1}{n} e^{-n f} A_n e^{n f}(x), 

where A_n is the generator for the process X_n, 

    A_n g(x) = \lim_{t→0+} \frac{1}{t} E[g(X_n(t)) - g(x) | X_n(0) = x]. 


The above transformation from A_n to H_n is essentially the logarithmic transform 
by Fleming [19]. If we denote by H the generator for V, then we expect generator 
convergence H_n → H will imply semigroup convergence V_n → V, which 
is suggested by the semigroup generation theorem: 

    V_n(t) h = \lim_{k→∞} (I - \frac{t}{k} H_n)^{-k} h 

and 

    (1.3)  V(t) h = \lim_{k→∞} (I - \frac{t}{k} H)^{-k} h. 


Going backward in the reasoning, modulo regularity conditions, we expect convergence 
H_n → H will give the large deviation of {X_n}. 
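
As an illustration of the logarithmic transform, consider a hypothetical one-dimensional small-noise diffusion with generator A_n g = b g' + g''/(2n) (drift b, unit diffusion coefficient; this choice is an assumption made only for the example). The chain rule then gives (1/n) e^{-nf} A_n e^{nf} = b f' + (1/2)(f')^2 + f''/(2n), so the formal limit is H f = b f' + (1/2)(f')^2. The sketch below compares the two sides by finite differences; all function names are illustrative.

```python
import math

def derivatives(g, x, h=1e-4):
    """Central finite-difference approximations of g'(x) and g''(x)."""
    d1 = (g(x + h) - g(x - h)) / (2.0 * h)
    d2 = (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)
    return d1, d2

def h_n_transform(f, b, n, x):
    """(1/n) e^{-n f} A_n e^{n f}(x) with the assumed A_n g = b g' + g'' / (2n)."""
    g = lambda y: math.exp(n * f(y))
    g1, g2 = derivatives(g, x)
    a_n_g = b(x) * g1 + g2 / (2.0 * n)
    return math.exp(-n * f(x)) * a_n_g / n

def h_n_closed_form(f, b, n, x):
    """b f' + (1/2) (f')^2 + f'' / (2n), the chain-rule expression for H_n f."""
    f1, f2 = derivatives(f, x)
    return b(x) * f1 + 0.5 * f1 * f1 + f2 / (2.0 * n)

if __name__ == "__main__":
    drift = lambda x: -x
    print(h_n_transform(math.sin, drift, 5, 0.3))
    print(h_n_closed_form(math.sin, drift, 5, 0.3))
```

The quadratic term (1/2)(f')^2 survives as n grows while the second-order term vanishes, which is the source of the first-order Hamilton-Jacobi structure in the limit.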

There are practical problems if we want to rigorously apply the above program 
to examples. First, the formula in (1.3) requires 

    (1.4)  (I - αH) f = h 

to hold in the classical sense for all h ∈ D and α > 0 (if H is dissipative, then such 
f is also unique). However, this is extremely hard to verify for most examples. 
Therefore we are forced to modify the above formulation by using a type of weak 
solution called the viscosity solution (Definition 1.14). By weakening the type of 
solution needed for (1.4), we have to require a strong form of uniqueness condition 
known as the comparison principle (Definition 1.15). Informally, this principle 
states that, if upper semicontinuous \overline{f} and lower semicontinuous \underline{f} satisfy 

    (I - αH) \overline{f} ≤ h  and  (I - αH) \underline{f} ≥ h, 

then \overline{f} ≤ \underline{f}. The \overline{f} and \underline{f} are called, respectively, a subsolution and a 
supersolution. Sub- and supersolutions, in the large deviation context here, 
can be constructed by generalizing a procedure due to Barles and Perthame [2, 3]. 
When E is noncompact, in order for such an argument to go through, we need another 
crucial condition (Condition 2.2) for the transition probabilities. This condition 
requires the processes to be concentrated on a compact subset of the state space with 
high probability. 

Note that the above formulation is only based on inequalities for sub- and 
supersolutions. This provides an opportunity for further relaxations on conditions. 
We can introduce two more operators: H_0, H_1 so that Hf ≤ H_0 f and Hf ≥ H_1 f 
for all f ∈ D(H_0) ∩ D(H_1). Then 

    (I - αH_0) \overline{f} ≤ h  and  (I - αH_1) \underline{f} ≥ h. 

Suppose that the comparison principle still holds for the above two "in-equations" 
(i.e., \overline{f} ≤ \underline{f}). The construction of \overline{f}, \underline{f} by the Barles-Perthame procedure then 
reveals that \overline{f} = \underline{f} = f ∈ C_b(E). Hence, each h uniquely corresponds to an 
f ∈ C_b(E), and we can denote such correspondence by f = R_α h. Consequently, at 
least formally, R_α = (I - αH)^{-1}. In other words, H_0, H_1 implicitly determine H 
through its resolvent, and V(t)h = \lim_{k→∞} R_{t/k}^k h ∈ C_b(E). We can now completely 
avoid using H in the above program by replacing condition H_n → H by: for each 
f ∈ D(H_0) ∩ D(H_1), 

    H_1 f ≤ \liminf_{n→∞} H_n f_n ≤ \limsup_{n→∞} H_n f_n ≤ H_0 f,  some f_n → f. 




The above generalization is useful for applications where E is infinite dimen- 
sional. We illustrate this next. 

In general, the comparison principle proof relies upon test functions which 
behave like distance functions. For instance, in the case E = R^d, these test functions 
take the form f(x) = (μ/2)|x - y|^2, μ ∈ R (see [5]). For H_n's which are 
differential operators, there is no difficulty in including such functions in the domain. 
Furthermore, identifying Hf as a limit of H_n f is usually straightforward. 

However, the situation becomes tricky when E is infinite dimensional (e.g., 
Examples 1.2, 1.5 and 1.8). For instance, let E = L^2(\mathcal{O}) and \mathcal{O} = [0, 1) with periodic 
boundary, and 

    H_n f(x) = ⟨Δx, Df(x)⟩ + \frac{1}{2} ‖Df(x)‖^2 + o_f(1), 

where o_f(1) may depend on f. See (1.33) for definition of Df. We expect 

    H f(x) = ⟨Δx, Df(x)⟩ + \frac{1}{2} ‖Df(x)‖^2. 

But then, even for 

    (1.5)  f(x) = (μ/2) ‖x - y‖^2,  μ ∈ R, 

where y is arbitrarily smooth, 

    \lim_{x_n→x} H_n f(x_n) ≠ H f(x). 

Note that in this case, Df(x) = μ(x - y) and ⟨Δx, Df(x)⟩ is well defined as a 
function taking value in extended reals 

    ⟨Δx, Df(x)⟩ = μ⟨Δx, x - y⟩ = -μ‖∇x‖^2 + μ⟨∇x, ∇y⟩. 

Assuming μ > 0, by lower semicontinuity of ‖∇x‖, we can however obtain 
\limsup_{x_n→x} H_n f(x_n) ≤ H f(x). Similarly, by reversing the inequality, we can 
verify a lower bound estimate for the case μ < 0. 

More generally, if the above Δ is replaced by a general nonlinear dissipative 
operator C [assuming the domain of C is not the entire E, D(C) ≠ E], integration 
by parts may not even make sense any more. Consequently 

    H f(x) = ⟨Cx, Df(x)⟩ + \frac{1}{2} ‖Df(x)‖^2 

does not make sense for all x ∈ E, even if y ∈ D(C). Note that, if C is dissipative, 

    μ⟨Cx, x - y⟩ ≤ μ⟨Cy, x - y⟩  ∀x, y ∈ D(C),  μ > 0. 

The right-hand side of the above is continuous in x, and can be extended to all 
x ∈ E easily. Hence at least for f of the form (1.5) with μ > 0, if H_n f(x_n) is 
defined, then it can be estimated from above by 

    \limsup_{x_n→x} H_n f(x_n) ≤ \lim_{x_n→x} ( μ⟨Cy, x_n - y⟩ + \frac{1}{2} ‖Df(x_n)‖^2 + o_f(1) ) 
                               = μ⟨Cy, x - y⟩ + \frac{1}{2} ‖Df(x)‖^2 ≡ H_0 f(x). 

Furthermore, such H_0 f ∈ C(E) and is everywhere well defined. By the arbitrariness 
of y ∈ D(C), we hope such H_0 provides a sharp estimate on the asymptotics 
of H_n's. Similarly, we can also estimate H_n f from below by some H_1 f, if μ < 0. 

We call (1.4) a Hamilton-Jacobi equation, because of its connection with the 
optimal control problem. By the Markov property on the processes X_n, the A_n's 
satisfy the maximum principle. The H_n's, obtained as a transform of the A_n's, also 
satisfy a nonlinear maximum principle. So does the limiting H (and frequently, 
H_0, H_1). Using this property, the following variational representation of H_n can 
usually be proved [17]: 

    H_n f(x) = \sup_{u∈U} ( B_n f(x, u) - L_n(x, u) ), 

where U is some auxiliary metric space, and for each u fixed, B_n f(·, u) is a linear 
operator satisfying the maximum principle in C_b(E), L_n is a lower semicontinuous 
bivariate function. In the limit, H is supposed to have a similar structure. This 
is known as the Nisio representation of generator for Hamiltonian operator H in 
optimal control theory [25]. Based upon such representation, [18] proved theorems 
ensuring a simpler, variational representation of the rate function in an "action 
integral" form. 


1.2. Basic setup. Let X_n be Hilbert-space-valued diffusion processes satisfying 

    (1.6)  dX_n(t) = \hat{C}_n X_n(t) dt + \frac{1}{\sqrt{n}} B_n(X_n(t)) dW(t), 

where W is a cylindrical Wiener process [see (1.28)] on a separable real Hilbert 
space U_0, E is another separable real Hilbert space, \hat{C}_n - ωI is an m-dissipative 
(possibly) nonlinear operator on E for some ω > 0: that is, 

    ⟨\hat{C}_n x - \hat{C}_n y, x - y⟩ ≤ ω‖x - y‖^2  ∀x, y ∈ D(\hat{C}_n), 

and the range of I - α\hat{C}_n satisfies 

    R(I - α\hat{C}_n) = E  ∀α > 0. 

We also assume that 0 ∈ D(\hat{C}_n), and B_n(x): U_0 → E is a Hilbert-Schmidt operator 
for each x ∈ E fixed. More conditions are needed in order to make sense of 
the solution and large deviation result of (1.6); we delay them until the statement 
of the respective theorem. To simplify the presentation, we will actually deal with 
another form of the above equation: 

    (1.7)  dX_n(t) = C_n X_n(t) dt + F_n(X_n(t)) dt + \frac{1}{\sqrt{n}} B_n(X_n(t)) dW(t), 

where C_n is m-dissipative, C_n 0 = 0 and F_n(x): E → E is globally Lipschitz in x. 
To rewrite (1.6) into the form of (1.7), we take 

    C_n x = \hat{C}_n x - \hat{C}_n 0 - ωx,  F_n(x) = \hat{C}_n 0 + ωx. 
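
The splitting of the drift in (1.6) into the pair (C_n, F_n) of (1.7) can be illustrated on a toy scalar case. The sketch below assumes E = R and a hypothetical drift c_hat(x) = -x^3 + 2x + 1, for which c_hat - ωI is dissipative with ω = 2; none of these choices come from the paper, they only make the algebra concrete.

```python
OMEGA = 2.0  # assumed constant making c_hat - OMEGA * I dissipative

def c_hat(x):
    """Hypothetical scalar drift, with E = R standing in for the Hilbert space."""
    return -x ** 3 + 2.0 * x + 1.0

def c(x):
    """C x = c_hat(x) - c_hat(0) - omega x, the dissipative part with C 0 = 0."""
    return c_hat(x) - c_hat(0.0) - OMEGA * x

def f(x):
    """F(x) = c_hat(0) + omega x, globally Lipschitz with constant omega."""
    return c_hat(0.0) + OMEGA * x

def is_dissipative(op, points, tol=1e-9):
    """Check (op(x) - op(y)) * (x - y) <= 0 on all pairs of sample points."""
    return all((op(x) - op(y)) * (x - y) <= tol for x in points for y in points)

GRID = [-3.0 + 0.25 * i for i in range(25)]

if __name__ == "__main__":
    print(is_dissipative(c, GRID), is_dissipative(c_hat, GRID))
```

Here c(x) reduces to -x^3, which is dissipative and vanishes at 0, while the non-dissipative linear part and the constant are absorbed into the Lipschitz term f.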


We provide three examples. 




EXAMPLE 1.2 (Stochastic Allen-Cahn equation). Let \mathcal{O} = [0, 1)^d, d = 
1, 2, 3, ..., with periodic boundary condition; we associate L^2(\mathcal{O}) with the usual 
inner product 

    ⟨x, y⟩ = \int_{θ=(θ_1,...,θ_d)∈\mathcal{O}} x(θ) y(θ) dθ,  x, y ∈ L^2(\mathcal{O}). 

By a stochastic Allen-Cahn equation, we refer to the following formally written 
stochastic PDE 

    (1.8)  \frac{∂}{∂t} Y_n(t, θ) = ΔY_n(t, θ) - V'(Y_n(t, θ)) 
                + \frac{1}{\sqrt{n}} σ(θ; Y_n(t)) \frac{∂^{d+1}}{∂t ∂θ_1 ⋯ ∂θ_d} β(t, θ), 

where β(t, θ) is a Brownian sheet over space-time (t, θ) ∈ [0, ∞) × \mathcal{O}, V ∈ C^1(R) 
and 

    (1.9)  σ(θ; y) = φ(θ, ⟨y, ξ_1⟩, ..., ⟨y, ξ_k⟩) 

for some ξ_1, ..., ξ_k ∈ L^2(\mathcal{O}) and φ(θ, r_1, ..., r_k): \mathcal{O} × R^k → R. 

The above is a stochastically perturbed reaction-diffusion type equation. 
Among other applications, it has been used in material science as a phenomenological 
model of material interface movements due to molecular-level adsorption-desorption 
processes. V ∈ C^1(R) is usually a double- or multiple-well potential 
function. From large deviations for {Y_n}, we can extract information about 
metastability of the whole system when the temperature is small. 

It is well known that, in dimension d ≥ 2, (1.8) admits no L^2(\mathcal{O})-valued 
solution. We will actually consider an approximate version of it which is defined 
on truncated Fourier modes. The number of modes goes to infinity as n → +∞. 
Such consideration is motivated by the fact that, in the above mentioned application, 
(1.8) should only be viewed as a formal limit for some stochastic Ginzburg-Landau 
equation defined on finite lattices or on truncated Fourier modes [30]. It 
is usually the rescaling limits of these finite systems which we really care about, 
rather than the continuum level (1.8). We mention that large deviation for the 
lattice case is studied in [18]. 

To rigorously define the processes, we let 

    φ_1(r) ≡ 1,  φ_{2k+1}(r) = \sqrt{2} \cos(2πkr), 
    φ_{2k}(r) = \sqrt{2} \sin(2πkr),  k = 1, 2, ..., 

and 

    μ_1 = 0,  μ_{2k+1} = μ_{2k} = 4π^2 k^2. 

It follows that 

    (1.10)  -φ_j'' = μ_j φ_j. 



Therefore 

    (1.11)  {e_k(θ) = φ_{k_1}(θ_1) × φ_{k_2}(θ_2) × ⋯ × φ_{k_d}(θ_d): 
             k = (k_1, ..., k_d), k_j = 1, 2, ...; j = 1, ..., d} 

forms a complete orthonormal basis for E = U_0 = L^2(\mathcal{O}). Denote 

    (1.12)  λ_k = μ_{k_1} + ⋯ + μ_{k_d}; 

then 

    -Δe_k = λ_k e_k. 

Let {β_k(t): k = (k_1, ..., k_d), k_j = 1, 2, ...; j = 1, ..., d} be a sequence of i.i.d. 
real-valued standard Brownian motions, and let 

    β(t, θ) = \sum_k β_k(t) \int_{r_1=0}^{θ_1} ⋯ \int_{r_d=0}^{θ_d} e_k(r_1, ..., r_d) dr_1 ⋯ dr_d. 

We define an L^2(\mathcal{O})-valued cylindrical Wiener process 

    (1.13)  W(t) = \sum_k β_k(t) e_k. 

Suppose 
m^ 
(1.14) m my, = oo, sup « OQ. 


[This scaling is needed in (A.9) when verifying the exponential compact contain- 
ment property (Condition 2.2) for the processes. In addition, it is also used to verify 
Condition 1.11(3) in Theorem 1.10.] 

Let projection operator 

    P_n x = \sum_{k_1=1}^{m_n} ⋯ \sum_{k_d=1}^{m_n} ⟨x, e_k⟩ e_k ∈ span(e_k: 1 ≤ k_j ≤ m_n, j = 1, ..., d) 

and for each x ∈ L^2(\mathcal{O}) fixed, we define linear operator B(x) on L^2(\mathcal{O}) by 

    (1.15)  (B(x)u)(θ) = σ(θ; x) u(θ),  u ∈ U_0 = L^2(\mathcal{O}). 

We regularize linear operator 

    (1.16)  B_n(x)u = P_n(B(P_n x)u), 

and arrive at an L^2(\mathcal{O})-valued diffusion 

    (1.17)  dX_n(t) = ΔP_n X_n(t) dt - P_n V'(P_n X_n(t)) dt + \frac{1}{\sqrt{n}} B_n(X_n(t)) dW(t), 




where the term B_n(X_n(t)) dW(t) is understood as 

    B_n(X_n(t)) dW(t) ≡ σ(·; P_n X_n(t)) \sum_{k_1=1}^{m_n} ⋯ \sum_{k_d=1}^{m_n} dβ_k(t) e_k. 

Equation (1.17) can be written in the form of (1.7). Let ω = \sup_{-∞<r<∞} |V''(r)| 
and 

    (1.18)  C_n x = ΔP_n x - P_n V'(P_n x) + P_n V'(0) - ωP_n x,  x ∈ L^2(\mathcal{O}), 
            F_n(x) = -P_n V'(0) + ωP_n x. 

Then C_n 0 = 0, C_n is m-dissipative in L^2(\mathcal{O}) (Lemmas A.2 and A.3) and 

    dX_n(t) = C_n X_n(t) dt + F_n(X_n(t)) dt + \frac{1}{\sqrt{n}} B_n(X_n(t)) dW(t). 

To prove the large deviation theorem, we assume: 


CONDITION 1.3. 
(1) V ∈ C^2(R) and 

    \sup_{-∞<r<∞} |V''(r)| < ∞. 


(2) There exist c1, c2 > 0, such that 
V(r)zoa-c er^. 


(3) The φ(θ, r_1, ..., r_k): \mathcal{O} × R^k → R in (1.9) is bounded continuous and 
Lipschitz in r_1, ..., r_k, uniformly with respect to θ. 


Condition 1.3(3) implies that operator B(x) defined by (1.15) is Lipschitz in x: 

    \sup_{x≠y} \frac{|||B(x) - B(y)|||}{‖x - y‖_{L^2(\mathcal{O})}} < ∞, 

where |||·||| is the operator norm 

    |||B(x)||| = \sup_{‖u‖_{L^2(\mathcal{O})} ≤ 1} ‖B(x)u‖_{L^2(\mathcal{O})}. 


For each n fixed, (1.17) can actually be represented as a finite-dimensional sto- 
chastic ODE; therefore the existence and uniqueness of the solution hold by stan- 
dard finite-dimensional results. 

Applying the main theorem of this paper, Theorem 1.10, we have: 


THEOREM 1.4. Under Condition 1.3 and scaling relation (1.14), the solutions 
X_n(t) of (1.17) satisfy a large deviation principle in C_{L^2(\mathcal{O})}[0, ∞) with good 
rate function I as defined in (1.31). 




With a mild amount of additional work, assuming \inf_{θ,x} σ(θ; x) > 0, the rate 
function I can be represented more explicitly: 

    I(x) = I_0(x(0)) + \frac{1}{2} \int_0^∞ \int_{\mathcal{O}} | \frac{\frac{∂}{∂t} x(t, θ) - Δx(t, θ) + V'(x(t, θ))}{σ(θ; x(t))} |^2 dθ dt. 

Feng and Kurtz [18] discuss this type of representation in general. See Section 4.1 
for an outline of the approach. 


EXAMPLE 1.5 (Stochastic Cahn-Hilliard equation). We still consider \mathcal{O} = 
[0, 1)^d with periodic boundary condition, but with the restriction d = 1, 2 or 3 
now. We consider stochastic perturbation of the Cahn-Hilliard equation formally 
given by 

    (1.20)  \frac{∂}{∂t} Y_n(t, θ) = Δ(-ΔY_n(t, θ) + V'(Y_n(t, θ))) + \frac{1}{\sqrt{n}} \frac{∂^{d+1}}{∂t ∂θ_1 ⋯ ∂θ_d} β(t, θ), 

where β(t, θ) is a Brownian sheet on [0, ∞) × \mathcal{O}; or, equivalently, 

    \frac{∂}{∂t} Y_n(t, θ) + div_θ(∇_θ(ΔY_n(t, θ) - V'(Y_n(t, θ)))) = \frac{1}{\sqrt{n}} \frac{∂^{d+1}}{∂t ∂θ_1 ⋯ ∂θ_d} β(t, θ). 

Y_n is asymptotically conserved in the sense that, in the n → +∞ limit Y, 
\int_{\mathcal{O}} Y(t, θ) dθ is constant in time. As in Example 1.2, such an equation has extensive 
applications in material science. Motivated by the same reason as before, we only 
rigorously study the large deviation for a variant of (1.20), which is defined on 
finite Fourier modes. The number of modes goes to infinity slowly, as n goes to 
infinity. 

We assume: 


CONDITION 1.6. 
(1) V ∈ C^3(R) and 

    \sup_{-∞<r<∞} ( |V'''(r)| + |V''(r)| ) < ∞. 


(2) There exist c1, co > 0, such that 
Ví(r) zci4- cor. 
(3) 


3d 





(1.21) im m = 00, 




The above scaling requirement on m, is needed for reasons similar to those in 
the previous Allen—Cahn example. See the proof of (A.14) and the requirement in 
Condition 1.11(3). 

We choose E = U_0 = L^2(\mathcal{O}). We define e_1, ..., e_k, ..., P_n and W as in 
Example 1.2 and 

    B_n u = P_n u  ∀u ∈ U_0 = L^2(\mathcal{O}). 

We consider L^2(\mathcal{O})-valued diffusions: 

    (1.22)  dX_n(t) = ΔP_n(-ΔP_n X_n(t) + V'(P_n X_n(t))) dt + \frac{1}{\sqrt{n}} B_n dW(t), 

where 

    B_n dW(t) = \sum_{k_1=1}^{m_n} ⋯ \sum_{k_d=1}^{m_n} e_k dβ_k(t). 
k1-1 kq=l 

Let o = i sup, |V"(r) |, 
    (1.23)  C_n x = ΔP_n(-ΔP_n x + V'(P_n x)) - ωP_n x, 

and 

    F_n(x) = ωP_n x. 

Then (1.22) can be written in the form of (1.7): 

    dX_n(t) = C_n X_n(t) dt + F_n(X_n(t)) dt + \frac{1}{\sqrt{n}} B_n dW(t). 


THEOREM 1.7. Under Condition 1.6 and the scaling relation (1.21), the solutions 
X_n(t) of (1.22) satisfy a large deviation principle in C_{L^2(\mathcal{O})}[0, ∞) with good 
rate function I as in (1.31). 


As in the stochastic Allen-Cahn example, the rate function I can be further 
simplified: 

    I(x) = I_0(x(0)) + \frac{1}{2} \int_0^∞ \int_{\mathcal{O}} | \frac{∂}{∂t} x(t, θ) - Δ_θ(-Δ_θ x(t, θ) + V'(x(t, θ))) |^2 dθ dt. 

Another type of stochastic perturbation [30] to the Cahn-Hilliard equation could 
also be interesting: 

    (1.24)  \frac{∂}{∂t} Y_n(t, θ) + div_θ( ∇_θ(Δ_θ Y_n(t, θ) - V'(Y_n(t, θ))) + \frac{1}{\sqrt{n}} \frac{∂^{d+1}}{∂t ∂θ_1 ⋯ ∂θ_d} β(t, θ) ) = 0, 

where 

    β(t, θ) = (β_1(t, θ), ..., β_d(t, θ)) 

with each β_i an independent real-valued space-time Brownian sheet. Large deviation 
for a lattice version of such an equation is considered in [18]. 


EXAMPLE 1.8 (Stochastic quasilinear equation with viscosity). Let \mathcal{O} = 
[0, 1) with periodic boundary. Suppose φ ∈ C^1(R) and \sup_r |φ'(r)| < ∞. We 
consider the following formally defined equation 

    \frac{∂}{∂t} Y_n(t, θ) + \frac{∂}{∂θ} φ(Y_n(t, θ)) = α \frac{∂^2}{∂θ^2} Y_n(t, θ) + \frac{1}{\sqrt{n}} \frac{∂^2}{∂t ∂θ} β(t, θ), 

where β(t, θ) is a Brownian sheet, (t, θ) ∈ [0, ∞) × \mathcal{O}. 
As before, let E = U_0 = L^2(\mathcal{O}) with norm ‖x‖^2 = \int_{\mathcal{O}} x^2(θ) dθ. {e_1, ..., e_k, ...} 
is the complete orthonormal basis as defined in (1.11) with d = 1. Define 
L^2(\mathcal{O})-valued cylindrical Wiener process 

    W(t) = \sum_{k=1}^∞ β_k(t) e_k, 

where (β_1, β_2, ...) are i.i.d. standard Brownian motions. Let 
(1.25) jim, My =O, sup pus "e 
and projection 

    P_n x = \sum_{k=1}^{2m_n} ⟨x, e_k⟩ e_k. 

We consider a regularized L^2(\mathcal{O})-valued diffusion equation, 

    (1.26)  dX_n(t) = αΔ_θ P_n X_n(t) dt - P_n ∂_θ φ(P_n X_n(t)) dt + \frac{1}{\sqrt{n}} B_n dW(t), 

where B_n = P_n and 

    B_n dW(t) = \sum_{k=1}^{2m_n} e_k dβ_k(t). 

The scaling on m_n is needed for the same reason as in the previous two examples. 
Let ω = (\sup_r |φ'(r)|)^2 / (4α), 

    (1.27)  C_n x = αΔ_θ P_n x - P_n ∂_θ φ(P_n x) - ωP_n x,  x ∈ L^2(\mathcal{O}), 

and 

    F_n(x) = ωP_n x,  F(x) = ωx. 

Then X_n satisfies (1.7). 




THEOREM 1.9. Suppose that (1.25) holds and that {X_n(0)} satisfies a large 
deviation principle with rate function I_0. Then X_n ∈ C_{L^2(\mathcal{O})}[0, ∞) satisfy a large 
deviation principle with rate function given by (1.31). 


With additional work, it can be shown that the rate function I admits the 
following form: 

    I(x) = I_0(x(0)) + \frac{1}{2} \int_0^∞ \int_{\mathcal{O}} | \frac{∂}{∂t} x - α \frac{∂^2}{∂θ^2} x + \frac{∂}{∂θ} φ(x) |^2 dθ dt. 


1.3. Technical assumptions and main results. Let (E, ‖·‖) and (U_0, ‖·‖_{U_0}) 
be two separable real Hilbert spaces. Let {e_1, e_2, ...} be a complete orthonormal 
basis of U_0. We define W, a cylindrical Wiener process on U_0, by 

    (1.28)  W(t) = \sum_{k=1}^∞ β_k(t) e_k,  t ≥ 0, 

where β_1, β_2, ... are i.i.d. real-valued standard Brownian motions with respect to 
a filtration F. X_n(0) is independent of F and we write F^n = F ∨ σ(X_n(0)). The 
infinite sum in (1.28) does not converge in (U_0, ‖·‖_{U_0}). However, we can always 
embed U_0 continuously into another separable real Hilbert space (U_1, ‖·‖_{U_1}), and 
as long as the embedding is Hilbert-Schmidt, the right-hand side of (1.28) converges 
in (U_1, ‖·‖_{U_1}). For example, let λ_k > 0 be such that \sum_{k=1}^∞ λ_k < ∞; define U_1 to 
be the completion of U_0 under 

    ⟨u, v⟩_{U_1} = \sum_k λ_k ⟨v, e_k⟩_{U_0} ⟨u, e_k⟩_{U_0}. 

Then the embedding U_0 → U_1 through identity map J is Hilbert-Schmidt. 
Throughout this paper, we will denote by L_2(U_0, E) the space of Hilbert-Schmidt 
operators from U_0 into E with norm 

    (1.29)  |||B|||^2_{L_2(U_0,E)} = \sum_k ‖Be_k‖^2 = Tr(B^*B) = Tr(BB^*)  ∀B ∈ L_2(U_0, E). 

Therefore, J ∈ L_2(U_0, U_1). To distinguish from this space, we will denote by 
L(U_0, E) the space of operators from U_0 to E which are linear and bounded. For 
B ∈ L(U_0, E), 

    (1.30)  |||B||| = |||B|||_{L(U_0,E)} = \sup_{u∈U_0, ‖u‖_{U_0} ≤ 1} ‖Bu‖. 

The following stochastic integral will be used in this paper: 

    \int_0^t B(s) dW(s),  B(s) ∈ L_2(U_0, E). 




Such an integral can be defined using telescoping Riemann summation just as the 
usual Itô integral in finite dimensions. Although the definition of W depends on U_1, 
the integral is independent of the choice of U_1. For details, see Chapter 4 of [9]. 
We identify an operator C in E by its graph: C ⊂ E × E, and denote the space 
of continuous functions on E by C(E). 
Our main result in the paper is the following: 


THEOREM 1.10. Let E, Uo be arbitrary separable real Hilbert spaces. Sup- 
pose the following condition holds. 


CONDITION 1.11. 


(1) Operator C_n ⊂ E × E, F_n ∈ C(E) and B_n(x) ∈ L_2(U_0, E) are single valued 
and everywhere defined. That is, 

    D(C_n) = D(F_n) = E,  D(B_n(x)) = U_0  ∀x ∈ E. 

Moreover, they are globally Lipschitz: 

    ‖C_n x - C_n y‖ + ‖F_n(x) - F_n(y)‖ + |||B_n(x) - B_n(y)|||_{L_2(U_0,E)} ≤ Constant_n ‖x - y‖, 

for every x, y ∈ E, where the Constant_n may depend on n. [See (1.29) for the 
definition of |||·|||_{L_2(U_0,E)}.] 

(2) C_n is m-dissipative on E; C_n 0 = 0 for n = 2, 3, ...; and there exists a (possibly 
multivalued) m-dissipative operator C ⊂ E × E, with \overline{D(C)} = E such 
that C ⊂ \lim_{n→∞} C_n, in the sense that for each (ξ, η) ∈ C, there exists ξ_n ∈ E 
such that \lim_{n→∞} ( ‖ξ - ξ_n‖ + ‖η - C_n ξ_n‖ ) = 0. 

(3) Whenever x_n → x_0 ∈ E, 

    \lim_{n→∞} \frac{1}{n} |||B_n(x_n)|||^2_{L_2(U_0,E)} = 0. 

(4) For each x ∈ E, there exist B(x) ∈ L(U_0, E) and F ∈ C(E) satisfying 

    \sup_{x≠y} \frac{|||B(x) - B(y)||| + ‖F(x) - F(y)‖}{‖x - y‖} < ∞, 

where |||·||| is the usual operator norm in (1.30). Furthermore, for each x_n, 
p_n ∈ E and x_n → x_0, p_n → p_0, we have 

    F_n(x_n) → F(x_0)  and  ‖B_n^*(x_n) p_n‖_{U_0} → ‖B^*(x_0) p_0‖_{U_0}. 


We also assume that X_n is the solution to (1.7), and that {X_n: n = 1, 2, ...} 
satisfies the following exponential compact containment condition: 




CONDITION 1.12. For each compact K ⊂ E, T > 0 and a > 0, there exists 
another compact set K_{a,T} ⊂ E such that 

    \limsup_{n→∞} \sup_{x∈K} \frac{1}{n} \log P(X_n(t) ∉ K_{a,T}, ∃ 0 ≤ t ≤ T | X_n(0) = x) ≤ -a. 


Finally, we assume that {X_n(0): n = 1, 2, ...} satisfies the large deviation 
principle with good rate function I_0 on E. 
Then: 


(a) {X_n} is exponentially tight; 
(b) the following limit exists and defines an operator semigroup on C_b(E) (the 
space of bounded continuous functions on E): 

    V(t) f(y) = \lim_{n→∞} \frac{1}{n} \log E[e^{n f(X_n(t))} | X_n(0) = y]; 

(c) the large deviation principle holds for {X_n} with good rate function I: 

    (1.31)  I(x) = \sup_{k; 0=t_0<t_1<⋯<t_k} ( I_0(x(0)) + \sum_{i=1}^k I^{t_i - t_{i-1}}(x(t_i) | x(t_{i-1})) ), 

where 

    I^t(y|x) = \sup_{f∈C_b(E)} ( f(y) - V(t) f(x) ). 


Theorems 1.4, 1.7 and 1.9 are all special cases of this theorem. 


1.4. Relation to other large deviation results in literature. The term C_n 
in (1.7) and its limit C in Theorem 1.10 are allowed to be totally nonlinear. This is 
different than what is available in literature [4, 9, 24, 27, 29], where C is restricted 
to be semilinear. However, this paper does not pursue generalities in the term B_n, 
as some of the above mentioned papers do. 

If C_n is semilinear, a good deal is known about the solution for (1.7). See [9]. 
However, if C_n is just m-dissipative, very little is known for the equation. By 
assuming C_n is Lipschitz and everywhere defined for each fixed n, we greatly 
simplified the situation. Such assumption is motivated by Examples 1.2 and 1.8, 
and by the fact that Yosida approximation of m-dissipative operators satisfies the 
above requirements. 

In this paper (1.7) is driven by a Brownian noise W, which is responsible for the 
quadratic nonlinear term in Ho, H1 (or the Ho, H 1). But in the proof of compar- 
ison principle, we actually allow much more general nonlinearity (Theorem 5.2). 
Therefore, it is possible that the method here can be applied to cases where the 
W is replaced by spatial Poisson noise. We expect exponential nonlinearity in 
the H;, B, 's in these cases. 




1.5. Notation. We will frequently use the following class of test functions for 
localization purpose: 


— ec C*([0, 00)):g > 0, is nondecreasing, 
(1.32) 
and y(r) = 9(4-o0) for r large enough}. 


Throughout the paper, (E, r) and (U, r_U) are complete separable metric 
spaces. Let f be a function on E. C(E) denotes continuous functions on E; 
C_b(E), bounded continuous functions; B(E), bounded Borel measurable functions; 
M(E), Borel measurable functions; and P(E), probability measures on 
E. For f ∈ M(E), we define f^* and f_* to be, respectively, the upper and lower 
semicontinuous smoothing of f: 

    f^*(x) = \limsup_{y→x} f(y),  f_*(x) = \liminf_{y→x} f(y). 

Let \mathcal{O} ⊂ R^d: 

    H^1(\mathcal{O}) = { x ∈ L^2(\mathcal{O}): \int_{\mathcal{O}} |x(θ)|^2 + |∇x(θ)|^2 dθ < ∞ } 


and 
d 
(0) = stb e 16» 2 Ji»? + aix (80)? + laf j*(9)] ^46. 
i,j, k=l 


Throughout, (E, ‖·‖) is a real separable Hilbert space with its dual identified as 
itself E^* = E. C^k(E) denotes the set of kth-order Fréchet differentiable functions 
on E with continuous kth-order derivative; we identify the kth-order derivative as a 
kth-order multilinear symmetric functional. For example, by Df(x), we mean the 
gradient of f evaluated at x, which is identified as an element of E^* = E through 
the Taylor expansion: 

    (1.33)  f(x + y) = f(x) + ⟨Df(x), y⟩ + \frac{1}{2} D^2 f(x) y y + o(‖y‖^2),  y ∈ E. 

D^2 f(x) y z means a functional which is bilinear in both the y and the z arguments. 
Let x, y ∈ E; by x ⊗ y, we mean a bounded linear operator on E 

    (x ⊗ y) z = x ⟨y, z⟩. 

|·| will be used to denote either an absolute value of a number |a| or the Euclidean 
norm of a vector in R^d: |(θ_1, ..., θ_d)|^2 = \sum_{i=1}^d θ_i^2. 

We denote the range of a generic operator A in a Banach space by R(A) and 
its domain by D(A). We often identify an operator with its graph. \overline{A} denotes the 
closure of the graph under the norm of the Banach space. Let E be a Hilbert space. 
A possibly multivalued nonlinear operator C ⊂ E × E is said to be m-dissipative 
if and only if it is dissipative: 

    ⟨x_1 - x_2, y_1 - y_2⟩ ≤ 0  ∀(x_i, y_i) ∈ C, i = 1, 2, 

and 

    R(I - αC) = E  ∀α > 0. 

If, in addition, \overline{D(C)} = E, then C generates a strongly continuous contraction 
semigroup S(t) on E. The following test function on E is introduced to record 
trajectory properties of S(t)y. It plays a major role in the analysis of certain 
Hamilton-Jacobi equations (Section 5). 


DEFINITION 1.13. Let C be an m-dissipative operator on E generating a 
strongly continuous semigroup {S(t): t ≥ 0}. The Tataru distance function d_C is 

    d_C(x, y) = \inf\{ t + ‖x - S(t)y‖ : t ≥ 0 \}  ∀x, y ∈ E. 

d_C(x, y) is Lipschitz [(28) on page 62 of [8]]: 

    |d_C(x, y) - d_C(\hat{x}, \hat{y})| ≤ ‖x - \hat{x}‖ + ‖y - \hat{y}‖. 
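
To make the Tataru distance concrete, the sketch below evaluates it in the simplest setting one can think of: E = R and Cx = -x, so that S(t)y = e^{-t} y. This scalar example and the grid search over t are illustrative assumptions, not taken from the paper; the infimum is only searched over 0 ≤ t ≤ t_max, which is harmless when |x - y| ≤ t_max because the value at t = 0 already bounds the infimum by |x - y|.

```python
import math

def tataru_distance(x, y, t_max=20.0, steps=20001):
    """Grid approximation of inf_{t >= 0} { t + |x - S(t) y| } with S(t) y = exp(-t) y."""
    dt = t_max / (steps - 1)
    return min(i * dt + abs(x - math.exp(-i * dt) * y) for i in range(steps))

if __name__ == "__main__":
    # For x = 0 and y = e the minimizer is t = 1, giving distance 2.
    print(tataru_distance(0.0, math.e))
    # The distance vanishes on the diagonal but is not symmetric in general.
    print(tataru_distance(1.0, 2.0), tataru_distance(2.0, 1.0))
```

The asymmetry reflects that y is allowed to flow along the semigroup towards x, but not the other way round.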


Let E be a general metric space again and let H_0, H_1 ⊂ C_b(E) × B(E) be 
(possibly multivalued) operators, h ∈ C_b(E) and α > 0. We define viscosity solutions 
for 

    (1.34)  (I - αH_0) f = h 

and 

    (1.35)  (I - αH_1) f = h. 


DEFINITION 1.14 (Viscosity solution). 


(a) \overline{f} is a viscosity subsolution of (1.34) if and only if \overline{f} is bounded, upper 
semicontinuous, and for each (f_0, g_0) ∈ H_0, there exists an {x_n} ⊂ E satisfying 

    (1.36)  \lim_{n→∞} (\overline{f} - f_0)(x_n) = \sup_x (\overline{f} - f_0)(x) 

and 

    (1.37)  \limsup_{n→∞} ( α^{-1}(\overline{f} - h) - (g_0)^* )(x_n) ≤ 0. 


(b) \underline{f} is a viscosity supersolution of (1.35) if and only if \underline{f} is bounded, lower 
semicontinuous, and for each (f_0, g_0) ∈ H_1, there exists an {x_n} ⊂ E satisfying 

    (1.38)  \lim_{n→∞} (f_0 - \underline{f})(x_n) = \sup_x (f_0 - \underline{f})(x) 

and 

    (1.39)  \liminf_{n→∞} ( α^{-1}(\underline{f} - h) - (g_0)_* )(x_n) ≥ 0. 




DEFINITION 1.15. We say a comparison principle holds for viscosity subsolution 
of (1.34) and supersolution of (1.35) if 

    \overline{f} ≤ \underline{f} 

for every subsolution \overline{f} of (1.34) and supersolution \underline{f} of (1.35). 


Allowing H_0, H_1 ⊂ C(E) × M(E), Tataru [31, 32] and Crandall and Lions [8] 
define viscosity solution in a different manner. Definition 1.16 is an adaptation of 
their definitions when the domain of operator is chosen properly. We will explore 
the connection between such definition and the more general Definition 1.14, for 
equations arising in our large deviation context. See Section 3.4. 

DEFINITION 1.16 (Tataru-Crandall-Lions). 


(a) We say that \overline{f} is a subsolution of (1.34), if \overline{f} is bounded upper semicontinuous 
on E, and for each x_0 ∈ E and f_0 ∈ D(H_0) satisfying 

    (1.40)  (\overline{f} - f_0)(x_0) = \sup_{x∈E} (\overline{f} - f_0)(x), 

we have 

    α^{-1}(\overline{f} - h)(x_0) ≤ (H_0 f_0)^*(x_0). 


(b) We say that \underline{f} is a supersolution of (1.35), if \underline{f} is bounded lower semicontinuous 
on E, and for each x_0 ∈ E and f_0 ∈ D(H_1) satisfying 

    (1.41)  (f_0 - \underline{f})(x_0) = \sup_{x∈E} (f_0 - \underline{f})(x), 

we have 

    α^{-1}(\underline{f} - h)(x_0) ≥ (H_1 f_0)_*(x_0). 


DEFINITION 1.17 (Exponential tightness and large deviation principle). Let 
S be a complete separable metric space and let {X_n} be S-valued random variables. 
{X_n} is said to be exponentially tight if for every a > 0, there exists compact 
K_a ⊂ S such that 

    \limsup_{n→∞} \frac{1}{n} \log P(X_n ∉ K_a) ≤ -a. 

{X_n} is said to satisfy the large deviation principle if there exists a lower 
semicontinuous function I: S → [0, +∞] such that for every open set A ⊂ S, 

    -\inf_{x∈A} I(x) ≤ \liminf_{n→∞} \frac{1}{n} \log P(X_n ∈ A) 

and for every closed set B ⊂ S, 

    \limsup_{n→∞} \frac{1}{n} \log P(X_n ∈ B) ≤ -\inf_{x∈B} I(x). 




I is called the rate function and it is good if each level set is compact. 
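
A quick numerical look at the closed-set upper bound, as a sketch under assumptions that are not in the paper: take S = R, X_n distributed as N(0, 1/n) (rate function I(x) = x^2/2) and the closed set B = [a, ∞). The tail probability is computed with the standard-library complementary error function, and the helper name is illustrative.

```python
import math

def scaled_log_tail(a, n):
    """(1/n) log P(X_n >= a) for X_n ~ N(0, 1/n); the tail equals 0.5 * erfc(a * sqrt(n / 2))."""
    # Keep a * sqrt(n / 2) below roughly 26, otherwise erfc underflows to 0.0.
    return math.log(0.5 * math.erfc(a * math.sqrt(n / 2.0))) / n

if __name__ == "__main__":
    a = 1.0
    for n in (25, 100, 400):
        # Compare with -inf_{x >= a} I(x) = -a^2 / 2.
        print(n, scaled_log_tail(a, n), -0.5 * a * a)
```

The scaled log-probabilities approach -a^2/2 from below, with a correction of order (log n)/n coming from the Gaussian tail prefactor.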

Let E be a complete separable metric space. For each n, let stochastic 
process X_n have state space E and let its trajectory be continuous in time. By 
large deviation (resp. exponential tightness) for the processes {X_n}, we apply the 
above definition with S = C_E[0, ∞). 


DEFINITION 1.18. Let (E, q) be a metric space. D ⊂ C_b(E) is said to 
approximate the metric q if for each compact K ⊂ E and z ∈ K, there exists f_n ∈ D 
such that \lim_{n→∞} \sup_{x∈K} |f_n(x) - q(x, z)| = 0. 


2. A general large deviation theorem. This section presents a general theo- 
rem which is the basis for the large deviation method in this paper. The heuristics 
have been explained in Section 1.1. 

Let (E, r) be a complete separable metric space and let {X_n} be a sequence of 
E-valued processes with trajectories in C_E[0, ∞). Suppose A_n ⊂ B(E) × B(E) is 
possibly multivalued. Let X_n be a solution to the A_n-martingale problem. That is, 
there is a filtration {F_t^n}, such that 

    f(X_n(t)) - f(X_n(0)) - \int_0^t g(X_n(s)) ds  ∀(f, g) ∈ A_n 

is a martingale. We will work under the following regularity condition. 


CONDITION 2.1. For each n = 2, 3, ..., let A_n ⊂ B(E) × B(E). We assume 
existence and uniqueness hold for the martingale problem for A_n with 
X_n ∈ C_E[0, +∞) for each initial distribution μ ∈ P(E). Let P_x^n ∈ P(C_E[0, ∞)) 
denote the distribution of the solution of the martingale problem for A_n with 
X_n(0) = x ∈ E; we assume that the mapping x → P_x^n is Borel measurable taking 
the weak topology on P(C_E[0, ∞)) (cf. Theorem 4.4.6 of [15]). 


Define H_n ⊂ B(E) × B(E) by 

    H_n f = \frac{1}{n} e^{-nf} A_n e^{nf},  e^{nf} ∈ D(A_n), 

or if A_n is multivalued, 

    H_n = { (f, \frac{1}{n} e^{-nf} g): (e^{nf}, g) ∈ A_n }. 


The following is an exponential version of the uniform compact containment 
condition in the weak convergence theory. 


CONDITION 2.2. For each compact K ⊂ E, T > 0 and a > 0, there exists a 
compact K_{a,T} ⊂ E such that 

    \limsup_{n→∞} \sup_{x∈K} \frac{1}{n} \log P(X_n(t) ∉ K_{a,T}, for some 0 ≤ t ≤ T | X_n(0) = x) ≤ -a. 




The following is an adaptation of Theorem 7.18 of [18]. In the adaptation, we 
also used a result of exponential tightness (Corollary 4.19), a variant of the Stone— 
Weierstrass theorem (Lemma A.8) and a technical estimate (Lemma 7.19). All the 
reference labels refer to [18]. 


THEOREM 2.3. Let Condition 2.1 be satisfied. In addition, we assume the 
following: 


(1) Convergence of generators. There exist H_0, H_1 ⊂ C_b(E) × B(E) which are 
limits of the H_n's in the following sense: 
(a) For each (f, g) ∈ H_0, there exist some (f_n, g_n) ∈ H_n such that 

    \sup_n ( \sup_x |f_n(x)| + \sup_x |g_n(x)| ) < ∞, 

and that for each x_n → x_0, we have 

    \lim_{n→∞} f_n(x_n) = f(x_0),  \limsup_{n→∞} g_n(x_n) ≤ g^*(x_0). 

(b) For each (f, g) ∈ H_1, there exist some (f_n, g_n) ∈ H_n [possibly different 
than those in (a)] such that 

    \sup_n ( \sup_x |f_n(x)| + \sup_x |g_n(x)| ) < ∞, 

and that for each x_n → x_0, we have 

    \lim_{n→∞} f_n(x_n) = f(x_0),  g_*(x_0) ≤ \liminf_{n→∞} g_n(x_n). 

There exist F ⊂ C_b(E) which approximate the metric q = r ∧ 1 (Definition 1.18), 
and for each f ∈ F and λ > 0, λf ∈ D(H_0). 
(2) Uniform exponential compact containment. Condition 2.2 holds. 
(3) Comparison principle. There exist a subset D ⊂ C_b(E) and α_0 > 0, such 
that for each h ∈ D and 0 < α < α_0, the comparison principle (Definition 1.15) 
holds for subsolution (in the sense of Definition 1.14) of 

    (I - αH_0) f = h, 

and supersolution (Definition 1.14) of 

    (I - αH_1) f = h. 

D contains an algebra that separates points and vanishes nowhere [i.e., for 
each x ∈ E, there exists f belonging to this algebra such that f(x) ≠ 0]. 


Define {V_n(t)} on B(E) by 

    V_n(t) f(x) = \frac{1}{n} \log E[e^{n f(X_n(t))} | X_n(0) = x]. 

If {X_n(0)} satisfies a large deviation principle in E with a good rate function I_0, 
then: 




(a) The limit 

    V(t) f(x) = \lim_{n→∞} V_n(t_n) f(x_n) 

exists for every f ∈ C_b(E), t_n → t, and x_n → x. V(t) forms a nonlinear semigroup 
on C_b(E). 

(b) for each 0 ≤ t_1 ≤ ⋯ ≤ t_k < ∞, {(X_n(t_1), ..., X_n(t_k)): n = 1, 2, ...} is 
exponentially tight in E^k and satisfies the large deviation principle with good rate 
function 

    (2.1)  I_{t_1,...,t_k}(x_1, ..., x_k) 
             = \sup_{f_1,...,f_k∈D} { f_1(x_1) + ⋯ + f_k(x_k) 
                 - Λ_0( V(t_1)(f_1 + V(t_2 - t_1)(f_2 + ⋯ + V(t_k - t_{k-1}) f_k) ⋯ ) ) } 
             = \inf_{x_0∈E} [ I_0(x_0) + \sum_{i=1}^k I^{t_i - t_{i-1}}(x_i | x_{i-1}) ], 

where 

    Λ_0(f) = \lim_{n→∞} \frac{1}{n} \log E[e^{n f(X_n(0))}]  ∀f ∈ C_b(E) 

[the limit exists by (1.1)], and 

    (2.2)  I^t(y|x) = \sup_{f∈C_b(E)} ( f(y) - V(t) f(x) ). 

(c) {X_n} is exponentially tight in C_E[0, ∞) and satisfies the large deviation 
principle with good rate function: 

    (2.3)  I(x) = \sup_{\{t_i\}} I_{t_1,...,t_k}(x(t_1), ..., x(t_k)) 
                = \sup_{k=1,2,...} \sup_{0<t_1<t_2<⋯<t_k} ( I_0(x(0)) + \sum_{j=1}^k I^{t_j - t_{j-1}}(x(t_j) | x(t_{j-1})) ). 


REMARK 2.4. In view of the duality in (1.2), the form of rate function in
the first identity of (2.1) should be expected. The second equality follows by the
Markovian property of the $X_n$'s. The rate function (2.3) follows from the
finite-dimensional large deviation result in (b), and from a well-known projective limit
argument [10, 11, 28].
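The duality behind the first identity of (2.1) can be illustrated in a one-dimensional toy case. The sketch below is not taken from the paper: it assumes $X_n(0) \sim N(0, 1/n)$ on $E = \mathbb{R}$ and restricts to linear test functions $f(x) = \theta x$ (which are not bounded, so this only illustrates the Legendre-type structure), for which $\frac{1}{n}\log E[e^{n f(X_n(0))}] = \theta^2/2$ for every $n$. The names `log_moment_limit` and `rate_by_duality` are hypothetical.

```python
import math

def log_moment_limit(theta):
    # Assumed toy model: X_n(0) ~ N(0, 1/n) and f(x) = theta * x, so that
    # (1/n) log E[exp(n f(X_n(0)))] = theta**2 / 2 for every n.
    return 0.5 * theta * theta

def rate_by_duality(x, theta_max=5.0, steps=10001):
    # Grid approximation of sup_theta (theta * x - Lambda_0(theta)).
    best = -math.inf
    for i in range(steps):
        theta = -theta_max + 2.0 * theta_max * i / (steps - 1)
        best = max(best, theta * x - log_moment_limit(theta))
    return best

approx = rate_by_duality(1.3)   # numerical Legendre transform at x = 1.3
exact = 0.5 * 1.3 ** 2          # Gaussian rate function x**2 / 2
```

In this toy case the supremum recovers the Gaussian rate function $x^2/2$ up to grid error.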

Large deviation behavior of the $\{X_n\}$ and the exponential tightness imply that
the rate function $I$ is good. That is, $I$ has compact level sets. See part (b) of
Lemma 1.2.18 of [12]. Similarly, $I_{t_1,\ldots,t_k}$ has compact level sets in $E^k$.

Theorem 2.3 can be applied to situations other than small perturbation type 
problems; we refer the reader to [18] for further examples. 




In the rest of this paper we apply the above theorem to the general problem 
considered in Theorem 1.10. Step 1 is verified in Section 3.3; see Lemma 3.4. The 
condition in step 2 is assumed in Theorem 1.10 as Condition 1.12. It is verified for 
Examples 1.2, 1.5 and 1.8 in Section A.2. The comparison principle in step 3 is 
stated in Lemma 3.10, with details of the actual proof carried out in Section 5. 

Before closing this section, we illustrate how the classical Freidlin—Wentzell 
theory follows from Theorem 2.3. 


EXAMPLE 2.5 (The Freidlin-Wentzell theory). Let $E = \mathbb{R}^d$, and let $W$ be
a standard $d$-dimensional Brownian motion. Assume that $b_i, \sigma_{ij} \in C(\mathbb{R}^d)$ are
Lipschitz continuous for $i, j = 1, \ldots, d$. We denote $b(x) = (b_1(x), \ldots, b_d(x))$:
$\mathbb{R}^d \to \mathbb{R}^d$, and let the $d \times d$ matrix $\sigma(x) = (\sigma_{ij}(x))$. Let $X_n$ be the solution to
\[ dX_n(t) = b(X_n(t))\,dt + \frac{1}{\sqrt{n}}\,\sigma(X_n(t))\,dW(t). \]


This is a special case of (1.7). 

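Let $D = \{f : f = f_0 + c,\ f_0 \in C_c^2(\mathbb{R}^d),\ c \in \mathbb{R}\}$, where $C_c^2(\mathbb{R}^d)$ is the collection
of functions with compact support and with continuous derivatives up to the second
order. We denote by $D^2 f(x) = (\partial_{ij} f(x))_{i,j}$ the Hessian matrix of $f$. By Itô's
formula, if we take
\[ A_n f(x) = b(x) \nabla f(x) + \frac{1}{2n} \mathrm{Tr}\big( D^2 f(x)\, \sigma(x) \sigma^T(x) \big), \qquad f \in D, \]
then Condition 2.1 is satisfied. The transformed generator is
\[ H_n f(x) = b(x) \nabla f(x) + \frac{1}{2} |\sigma^T(x) \nabla f(x)|^2 + \frac{1}{2n} \mathrm{Tr}\big( D^2 f(x)\, \sigma(x) \sigma^T(x) \big), \qquad f \in D. \]
If we let $H_0 = H_1 = H$ with
\[ H f(x) = b(x) \nabla f(x) + \frac{1}{2} |\sigma^T(x) \nabla f(x)|^2, \qquad f \in D, \]
then the convergence conditions in part (1) of Theorem 2.3 are satisfied.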
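The passage from $A_n$ to $H_n$ is the logarithmic transform $H_n f = \frac{1}{n} e^{-nf} A_n e^{nf}$. A minimal numerical sketch of this identity in dimension one follows; the choices $b(x) = -x$, $\sigma(x) = 1 + 0.1x$, $f = \sin$ and the finite-difference step are illustrative assumptions, not part of the example above (in particular $\sin$ does not belong to $D$; only the pointwise identity is checked).

```python
import math

def A_n(g, x, n, b, sigma, h=1e-4):
    # Linear generator b g' + (1/2n) sigma^2 g'' via central finite differences.
    d1 = (g(x + h) - g(x - h)) / (2.0 * h)
    d2 = (g(x + h) - 2.0 * g(x) + g(x - h)) / (h * h)
    return b(x) * d1 + sigma(x) ** 2 * d2 / (2.0 * n)

def H_n_from_definition(f, x, n, b, sigma):
    # H_n f = (1/n) exp(-n f) A_n exp(n f).
    g = lambda z: math.exp(n * f(z))
    return math.exp(-n * f(x)) * A_n(g, x, n, b, sigma) / n

def H_n_closed_form(x, n, b, sigma):
    # Closed form for f = sin: f' = cos, f'' = -sin.
    df, d2f = math.cos(x), -math.sin(x)
    return b(x) * df + 0.5 * (sigma(x) * df) ** 2 + sigma(x) ** 2 * d2f / (2.0 * n)

def H_limit(x, b, sigma):
    # Formal limit H f = b f' + (1/2) |sigma f'|^2 for f = sin.
    df = math.cos(x)
    return b(x) * df + 0.5 * (sigma(x) * df) ** 2

b = lambda x: -x                 # assumed drift
sigma = lambda x: 1.0 + 0.1 * x  # assumed diffusion coefficient
from_definition = H_n_from_definition(math.sin, 0.7, 5, b, sigma)
closed_form = H_n_closed_form(0.7, 5, b, sigma)
```

The two values agree up to discretization error, and `H_n_closed_form` approaches `H_limit` as `n` grows, since the second-order term carries the factor $1/(2n)$.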


The assumptions on $\sigma$, $b$ imply that they grow at most linearly:
\[ \sum_{i,j,k=1}^{d} \big( |\sigma_{ij}(x)| + |b_k(x)| \big) \le c_1 + c_2 |x|. \]

One can use this estimate to verify the uniform exponential compact containment
condition in Theorem 2.3. This is shown in Example 4.23 of [18] using a stochastic
Lyapunov function technique.

Let $h \in C_b(\mathbb{R}^d)$ and $\alpha > 0$; the comparison principle for

(2.4)
\[ (I - \alpha H) f = h \]




follows from results in [5]. Details on its proof can also be found in Chapters
9.4 and 10.3 of [18].
Consequently, by Theorem 2.3, the large deviation principle holds for $\{X_n\}$.
Let $\int_0^T |u(s)|^2\,ds < \infty$ for each $T > 0$ and consider

(2.5)
\[ \dot{x}(t) = b(x(t)) + \sigma(x(t)) u(t). \]
We define
\[ R_\alpha h(x_0) = \sup\Big\{ \int_0^\infty e^{-\alpha^{-1} s} \Big( \alpha^{-1} h(x(s)) - \frac{1}{2} |u(s)|^2 \Big)\,ds : (x, u) \text{ satisfies (2.5)},\ x(0) = x_0 \Big\}. \]

Then it can be shown that $R_\alpha h \in C_b(\mathbb{R}^d)$, it is the unique solution to (2.4), and
\[ V(t) h(x_0) = \lim_{n\to\infty} R_{t/n}^{\,n} h(x_0)
 = \sup\Big\{ -\frac{1}{2} \int_0^t |u(s)|^2\,ds + h(x(t)) : \dot{x} = b(x) + \sigma(x) u,\ x(0) = x_0 \Big\}. \]


All these can be rigorously justified using the dynamic programming principle. 
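Suppose that $\sigma^{-1}(x)$ exists. Plug the above expression on $V$ in (2.3) and (2.2); we
obtain the simplified representation
\[ I(x) = I_0(x(0)) + \frac{1}{2} \int_0^\infty \big| \sigma^{-1}(x(s)) \big( \dot{x}(s) - b(x(s)) \big) \big|^2\,ds. \]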


Such result is known as the Freidlin—Wentzell theory [23]. 

All the above claims are well-known results in control theory and first-order 
Hamilton—Jacobi equation literature. In [18], rigorous proofs are provided and 
summarized again. 
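As a numerical illustration of the time-integral cost in Example 2.5, the sketch below evaluates a Riemann-sum approximation of $\frac{1}{2}\int_0^T |\sigma^{-1}(x)(\dot{x} - b(x))|^2\,ds$ for two discretized paths in dimension one. The drift $b(x) = -x$, the coefficient $\sigma \equiv 1$, the horizon $T = 1$ and the function name `fw_action` are assumptions made for illustration; the initial cost $I_0$ is not included.

```python
def fw_action(path, dt, b, sigma):
    # Riemann sum for (1/2) * integral of |sigma(x)^{-1} (x' - b(x))|^2 ds,
    # using forward differences for the velocity x'.
    total = 0.0
    for k in range(len(path) - 1):
        x = path[k]
        velocity = (path[k + 1] - path[k]) / dt
        control = (velocity - b(x)) / sigma(x)   # u = sigma^{-1} (x' - b)
        total += 0.5 * control * control * dt
    return total

b = lambda x: -x        # assumed drift
sigma = lambda x: 1.0   # assumed (invertible) diffusion coefficient
dt, steps = 0.001, 1000  # horizon T = steps * dt = 1

# Euler path of the unperturbed dynamics x' = b(x): no control is needed.
flow = [1.0]
for _ in range(steps):
    flow.append(flow[-1] + dt * b(flow[-1]))

# Path held at x = 1 against the drift: requires the control u = 1.
constant = [1.0] * (steps + 1)

cost_flow = fw_action(flow, dt, b, sigma)
cost_constant = fw_action(constant, dt, b, sigma)
```

The path that follows the deterministic flow has (numerically) zero cost, while holding the state at $x = 1$ for unit time costs $\frac{1}{2}\int_0^1 1\,ds = \frac{1}{2}$.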


3. Large deviation for diffusions in Hilbert space. Throughout this section
we assume Condition 1.11 holds and $X_n$ is a solution of (1.7). Both $E$ and $U_0$ are
real separable Hilbert spaces.


3.1. Semigroup on Hilbert spaces. We first recall some basic facts about
semigroups on $E$. These facts will be used later in the paper.

We assumed, in Condition 1.11(2), that $C$ is m-dissipative on $E$, and $\overline{D(C)} = E$.
By Crandall and Liggett's [6] semigroup generation theorem, it generates a
strongly continuous contraction semigroup on $E$:
\[ S(t) x = \lim_{n\to\infty} \Big( I - \frac{t}{n} C \Big)^{-n} x \qquad \forall x \in E, \]
and
\[ \|S(t) x - S(t) y\| \le \|x - y\| \qquad \forall t \ge 0,\ x, y \in E. \]




Since $(0, 0) \in C$ (or $C0 = 0$ if $C$ is single valued), $0 = (I - \alpha C)^{-1} 0$, $S(t) 0 = 0$
and $\|S(t) x\| \le \|x\|$.
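The exponential formula can be checked by hand in the simplest possible case. The sketch below assumes the toy operator $Cx = -cx$ on $E = \mathbb{R}$ with $c > 0$ (single valued, dissipative, $C0 = 0$), for which the resolvent is explicit and $S(t)x = e^{-ct}x$; the function names are hypothetical.

```python
import math

def resolvent_power(x, t, n, c=1.0):
    # Assumed toy operator on E = R: C x = -c x with c > 0.
    # One resolvent step is (I - (t/n) C)^{-1} x = x / (1 + c t / n);
    # the exponential formula applies it n times.
    for _ in range(n):
        x = x / (1.0 + c * t / n)
    return x

def semigroup(x, t, c=1.0):
    # Exact semigroup generated by C x = -c x.
    return math.exp(-c * t) * x

errors = [abs(resolvent_power(2.0, 1.0, n) - semigroup(2.0, 1.0))
          for n in (10, 100, 1000)]
```

The errors decrease as $n$ grows, and each resolvent step is a contraction that fixes $0$, in line with $S(t)0 = 0$ and $\|S(t)x\| \le \|x\|$.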


DEFINITION 3.1 (Canonical restriction of C). We denote
\[ |Cx| = \inf\{ \|y\| : (x, y) \in C \} \qquad \forall x \in D(C) \]
and define a single-valued $C^0 \subset E \times E$, called the canonical restriction of $C$, by
\[ C^0 x = \{ z : (x, z) \in C,\ \|z\| = |Cx| \}. \]


Then the following holds. 


LEMMA 3.2.

(1) $D(C^0) = D(C)$ and $C^0$ is single valued (Lemma 2.19 of [26]).
(2) $C^0$ is the infinitesimal generator of $S(t)$ in the sense that

(3.1)
\[ C^0 x = \lim_{h \to 0+} \frac{1}{h} \big( S(h) x - x \big), \qquad x \in D(C) \]
(Corollary 4.19 in [26]).
(3) Let $f \in C^1(E)$; then

(3.2)
\[ \langle Df(\xi), C^0 \xi \rangle = \lim_{t \to 0+} \frac{f(S(t)\xi) - f(\xi)}{t} \qquad \forall \xi \in D(C). \]

DEFINITION 3.3 [Directional derivative along the trajectory of S(t)]. Suppose
$f \in C(E)$ is Lipschitz continuous. We define
\[ D_C^+ f(x) = \limsup_{h \to 0+,\ y \to x} \frac{1}{h} \big( f(S(h) y) - f(y) \big) \]
and
\[ D_C^- f(x) = \liminf_{h \to 0+,\ y \to x} \frac{1}{h} \big( f(S(h) y) - f(y) \big). \]

$D_C^+ f : E \to [-\infty, +\infty]$ is upper semicontinuous, and $D_C^- f : E \to [-\infty, +\infty]$
is lower semicontinuous (Lemma 2.3 in [8]).

We list two useful properties of the Tataru distance function dc (Defini- 
tion 1.13): 


(3.3)
\[ d_C(x, y) - d_C(\hat{x}, \hat{y}) \le \|x - \hat{x}\| + \|y - \hat{y}\| \]

and

(3.4)
\[ -D_C^- d_C(x, \cdot) \le 1, \qquad D_C^+ d_C(\cdot, y) \le 1. \]


See page 62 of [8] for proof. 
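A small numerical sketch may help make these properties concrete. Definition 1.13 is not reproduced in this section, so the code assumes the usual form $d_C(x, y) = \inf_{t \ge 0} \{ t + \|x - S(t) y\| \}$, and it uses the toy semigroup $S(t) y = e^{-t} y$ on $E = \mathbb{R}$; both are assumptions for illustration, and `tataru_distance` is a hypothetical name.

```python
import math

def tataru_distance(x, y, grid=2000):
    # Assumed definition: d_C(x, y) = inf_{t >= 0} ( t + |x - S(t) y| ),
    # with the toy semigroup S(t) y = exp(-t) * y on E = R.
    # The value at t = 0 is |x - y|, so the search can stop at t = |x - y|.
    horizon = abs(x - y)
    best = horizon
    for i in range(1, grid + 1):
        t = horizon * i / grid
        best = min(best, t + abs(x - math.exp(-t) * y))
    return best

d_ref = tataru_distance(0.5, 2.0)            # reference pair (x, y)
d_shifted = tataru_distance(0.4, 1.7)        # perturbed pair, for (3.3)
d_flowed = tataru_distance(math.exp(-0.1) * 0.5, 2.0)  # x moved along S(h), h = 0.1
```

For these values, `d_ref - d_shifted` stays below $|0.5 - 0.4| + |2.0 - 1.7|$, consistent with (3.3), and `d_flowed` stays below `0.1 + d_ref`, which is the finite-increment form of a bound of the type $D_C^+ d_C(\cdot, y) \le 1$.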




3.2. The martingale problem. Recall that $E$ is a real separable Hilbert space,
and that $C_n$ is single valued, everywhere defined and Lipschitz on $E$ ($C_n$ is
usually some regularization of $C$). Let $f \in C^2(E)$ be such that $Df(x) = 0$
when $\|x\|$ is sufficiently large; we define the linear operator $A_n \subset C_b(E) \times B(E)$ by

(3.5)
\[ A_n f(x) = \langle Df(x), C_n x + F_n(x) \rangle + \frac{1}{2n} \mathrm{Tr}\big[ D^2 f(x) B_n(x) B_n^*(x) \big]. \]


By Condition 1.11, an infinite-dimensional version of the Itô formula applies
(e.g., Theorem 4.17 of [9]). In addition, Condition 1.11(1) is a strong enough
assumption so that the results in Chapter 9 of [9] (regarding Markov property and
regularity for the initial conditions) apply. Therefore Condition 2.1 is satisfied.

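We next compute the nonlinear operator $H_n \subset C_b(E) \times B(E)$ by

(3.6)
\[ H_n f(x) = \frac{1}{n} e^{-nf} A_n e^{nf}(x) \]
\[ = \frac{1}{n} e^{-nf(x)} \langle D e^{nf}(x), C_n x + F_n(x) \rangle + \frac{1}{2n^2} e^{-nf(x)} \mathrm{Tr}\big[ D^2 e^{nf}(x) B_n(x) B_n^*(x) \big] \]
\[ = \langle Df(x), C_n x + F_n(x) \rangle + \frac{1}{2n^2} \mathrm{Tr}\big[ \big( D(nf)(x) \otimes D(nf)(x) + D^2(nf)(x) \big) B_n(x) B_n^*(x) \big] \]
\[ = \langle Df(x), C_n x + F_n(x) \rangle + \frac{1}{2} \|B_n^*(x) Df(x)\|_{U_0}^2 + \frac{1}{2n} \mathrm{Tr}\big[ D^2 f(x) B_n(x) B_n^*(x) \big], \]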

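where $(x \otimes y) z = x \langle y, z \rangle$. The last step above needs some justification: let
$(e_1, \ldots, e_k, \ldots)$ be a complete orthonormal basis of $E$. Then
\[ \frac{1}{2n^2} \mathrm{Tr}\big[ \big( D(nf)(x) \otimes D(nf)(x) \big) B_n(x) B_n^*(x) \big]
 = \frac{1}{2} \sum_{k=1}^{\infty} \big\langle \big( Df(x) \otimes Df(x) \big) B_n(x) B_n^*(x) e_k, e_k \big\rangle \]
\[ = \frac{1}{2} \sum_{k=1}^{\infty} \langle Df(x), e_k \rangle \langle Df(x), B_n(x) B_n^*(x) e_k \rangle
 = \frac{1}{2} \langle Df(x), B_n(x) B_n^*(x) Df(x) \rangle
 = \frac{1}{2} \|B_n^*(x) Df(x)\|_{U_0}^2. \]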



3.3. Convergence of the $H_n$'s. Formally, we expect the limit of $H_n f$ to be
given by
\[ H f(x) = \langle Df(x), Cx + F(x) \rangle + \frac{1}{2} \|B^*(x) Df(x)\|_{U_0}^2. \]

However, the above does not make sense for $x \notin D(C)$. As commented in
Section 1.1, we have to replace $H$ by $H_0$, $H_1$; then by selecting test functions $f$
carefully, we can estimate the limit from above by $H_0 f$ and from below by $H_1 f$. The
class of test functions has to be large enough so that the comparison principle
(Sections 3.4 and 5) can be proved.

This is what we will carry out rigorously next. 

By Condition 1.11, $C_n$ and $C$ generate, respectively, strongly continuous
contraction semigroups $S_n(t)$ and $S(t)$ on $E$, $S_n(t) 0 = 0$, $S(t) 0 = 0$ and $\|S_n(t) x\| \le \|x\|$,
$\|S(t) x\| \le \|x\|$.

We derive the limit operators Ho, Hı in Theorem 2.3 through several steps. 

First, recall definitions of the canonical restriction of C in Definition 3.1 and 
of the Tataru distance function dc in Definition 1.13. dc is Lipschitz; however, 
it may not be differentiable in x. We introduce smooth approximations of it first. 

By Condition 1.11, both $C_n$ and $C$ are m-dissipative, and $C \subset \lim_n C_n$. By
the Crandall-Liggett semigroup convergence theorem ([6]; see also Theorem 6.8
of [26]),
\[ \lim_{n\to\infty} \sup_{0 \le t \le T} \|S_n(t) y - S(t) y\| = 0 \qquad \forall y \in E,\ T > 0. \]

Let $\lim_{n\to\infty} a_n = \infty$; we define


r—e (r—e) 
2/6 Beye 
(8) — h&yQ) = inf[t + és (llx — S@)yI")}, 


(3.7) aps (ve + Jro Speer Oa 


89) Iney(t) = ——-log | © e antite ds 80512) qr 
Ay, 0 
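Then by Lemma A.12,
\[ \lim_{\varepsilon \to 0} \sup_{x \in E} |h_{\varepsilon,y}(x) - d_C(x, y)| = 0 \]
and
\[ \lim_{n\to\infty} \sup_{x \in K} |h_{\varepsilon,y}(x) - h_{n,\varepsilon,y}(x)| = 0 \]
for each compact $K \subset E$. Later, we may drop the $y$ in the subindex if no confusion
can occur.
Recall the definition of $\mathcal{T}$ in (1.32). We now define $H_0$ and $H_1$: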


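(a) Let
\[ D(H_0) = \big\{ f : f(x) = \varphi_1(\|x - \xi\|^2) + \varphi_2(h_{\varepsilon,y_1}(x)) + \cdots + \varphi_{k+1}(h_{\varepsilon,y_k}(x)),\ \forall \varphi_i \in \mathcal{T},\ \xi \in D(C),\ y_j \in E,\ k = 1, 2, \ldots \big\}. \]

For $g(x) = \varphi_1(\|x - \xi\|^2)$ and $f(x) = g(x) + \varphi_2(h_{\varepsilon,y_1}(x)) + \cdots + \varphi_{k+1}(h_{\varepsilon,y_k}(x)) \in D(H_0)$, we define

(3.10)
\[ H_0 f(x) = 2\varphi_1'(\|x - \xi\|^2) \langle x - \xi, C^0 \xi \rangle + \Big( \sup_{r > 0} \varphi_2'(r) + \cdots + \sup_{r > 0} \varphi_{k+1}'(r) \Big) \]
\[ \qquad + \sup_{\|q\| \le \varphi_2'(h_{\varepsilon,y_1}(x)) + \cdots + \varphi_{k+1}'(h_{\varepsilon,y_k}(x))} \Big( \langle F(x), Dg(x) + q \rangle + \frac{1}{2} \|B^*(x)(Dg(x) + q)\|_{U_0}^2 \Big). \]

By item (1) of Lemma A.12, and the fact that $Dg(x) = 0$ when $\|x\|$ is sufficiently
large, we have
\[ \sup_{x \in E} \ \sup_{\|q\| \le \varphi_2'(h_{\varepsilon,y_1}(x)) + \cdots + \varphi_{k+1}'(h_{\varepsilon,y_k}(x))} \Big( \langle F(x), Dg(x) + q \rangle + \frac{1}{2} \|B^*(x)(Dg(x) + q)\|_{U_0}^2 \Big) < \infty. \]
Consequently, $H_0 \subset C_b(E) \times B(E)$.

(b) Let
\[ D(H_1) = \big\{ f : f(x) = -\varphi_1(\|x - \xi\|^2) - \varphi_2(h_{\varepsilon,y_1}(x)) - \cdots - \varphi_{k+1}(h_{\varepsilon,y_k}(x)),\ \forall \varphi_i \in \mathcal{T},\ \xi \in D(C),\ y_j \in E,\ k = 1, 2, \ldots \big\}. \]
Let $f(x) = g(x) - \varphi_2(h_{\varepsilon,y_1}(x)) - \cdots - \varphi_{k+1}(h_{\varepsilon,y_k}(x)) \in D(H_1)$, where $g(x) = -\varphi_1(\|x - \xi\|^2)$. We define

(3.11)
\[ H_1 f(x) = -2\varphi_1'(\|x - \xi\|^2) \langle x - \xi, C^0 \xi \rangle - \Big( \sup_{r > 0} \varphi_2'(r) + \cdots + \sup_{r > 0} \varphi_{k+1}'(r) \Big) \]
\[ \qquad + \inf_{\|q\| \le \varphi_2'(h_{\varepsilon,y_1}(x)) + \cdots + \varphi_{k+1}'(h_{\varepsilon,y_k}(x))} \Big( \langle F(x), Dg(x) + q \rangle + \frac{1}{2} \|B^*(x)(Dg(x) + q)\|_{U_0}^2 \Big). \]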

The reason for using $\varphi_i$ is to localize the test function $f$ and the $H_0 f$, $H_1 f$ so
that $H_0, H_1 \subset C_b(E) \times B(E)$, a condition required by Theorem 2.3. This is the
main reason that the two operators have such complicated forms, instead of the


348 J. FENG 


simpler forms used by Crandall and Lions [8] using the $D_C^{\pm}$ in Definition 3.3. We
note that $H_0$ and $H_1$ are both single valued.
Let

(3.12)
\[ F = \big\{ f : f(x) = \varphi(\|x - \xi\|^2),\ \varphi \in \mathcal{T},\ \xi \in D(C) \big\} \subset D(H_0). \]

Then $F$ approximates the metric $q(x, y) = \|x - y\| \wedge 1$. In addition, for $\lambda > 0$, if
$f \in D(H_0)$, then $\lambda f \in D(H_0)$.


LEMMA 3.4.
(1) For each $f \in D(H_0)$, there exists $f_n \in D(H_n)$ such that
\[ \sup_n \sup_x \big( |f_n(x)| + |H_n f_n(x)| \big) < \infty \]
and
\[ \lim_{n\to\infty} f_n(x_n) = f(x_0), \qquad \limsup_{n\to\infty} H_n f_n(x_n) \le (H_0 f)^*(x_0) \]
whenever $x_n \to x_0$.
(2) For each $f \in D(H_1)$, there exists $f_n \in D(H_n)$ such that
\[ \sup_n \sup_x \big( |f_n(x)| + |H_n f_n(x)| \big) < \infty \]
and
\[ \lim_{n\to\infty} f_n(x_n) = f(x_0), \qquad \liminf_{n\to\infty} H_n f_n(x_n) \ge (H_1 f)_*(x_0) \]
whenever $x_n \to x_0$.


PROOF. Let us present the proof for $H_0$ only; the case for $H_1$ is similar. To
further simplify, let us just verify the case for test functions in $D(H_0)$ of the form
\[ f(x) = g(x) + \varphi_2(h_{\varepsilon,y}(x)) = \varphi_1(\|x - \xi\|^2) + \varphi_2(h_{\varepsilon,y}(x)) \in D(H_0), \]
where $h_{\varepsilon,y}$ is defined by (3.8). By Condition 1.11, there exists $\xi_n \in E$ such that
\[ \lim_{n\to\infty} \big( \|\xi_n - \xi\| + \|C^0 \xi - C_n \xi_n\| \big) = 0. \]
Let $a_n > 1$ satisfy $\lim_{n\to\infty} a_n = +\infty$. We define $h_{n,\varepsilon,y}$ according to (3.9). Let
\[ g_n(x) = \varphi_1(\|x - \xi_n\|^2), \qquad f_n(x) = g_n(x) + \varphi_2(h_{n,\varepsilon,y}(x)) \in D(H_n). \]




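By part 3 of Lemma A.12,
\[ \lim_{n\to\infty} \sup_{x \in K} |f_n(x) - f(x)| = 0 \qquad \text{for each } K \subset E \text{ compact}. \]

Apply (3.2), (A.18) and (A.20) to (3.6):

(3.13)
\[ H_n f_n(x) \le 2\varphi_1'(\|x - \xi_n\|^2) \langle x - \xi_n, C_n \xi_n \rangle + \sup_{r > 0} \varphi_2'(r) + \langle F_n(x), D(g_n + \varphi_2 \circ h_{n,\varepsilon,y})(x) \rangle \]
\[ \qquad + \frac{1}{2} \|B_n^*(x) D(g_n + \varphi_2 \circ h_{n,\varepsilon,y})(x)\|_{U_0}^2 + \frac{1}{2n} \mathrm{Tr}\big[ D^2 g_n(x) B_n(x) B_n^*(x) \big] + \frac{1}{2n} \mathrm{Tr}\big[ D^2(\varphi_2 \circ h_{n,\varepsilon,y})(x) B_n(x) B_n^*(x) \big] \]
\[ \le 2\varphi_1'(\|x - \xi_n\|^2) \langle x - \xi_n, C_n \xi_n \rangle + \sup_{r > 0} \varphi_2'(r) + \sup_{\|q\| \le \varphi_2'(h_{n,\varepsilon,y}(x))} \Big( \langle F_n(x), Dg_n(x) + q \rangle + \frac{1}{2} \|B_n^*(x)(Dg_n(x) + q)\|_{U_0}^2 \Big) \]
\[ \qquad + \frac{1}{2n} \mathrm{Tr}\big[ D^2 g_n(x) B_n(x) B_n^*(x) \big] + \frac{1}{2n} \mathrm{Tr}\big[ D^2(\varphi_2 \circ h_{n,\varepsilon,y})(x) B_n(x) B_n^*(x) \big], \]
where

(3.14)
\[ Dg_n(x) = 2\varphi_1'(\|x - \xi_n\|^2)(x - \xi_n), \]
(3.15)
\[ D^2 g_n(x) = 2\varphi_1'(\|x - \xi_n\|^2) I + 4\varphi_1''(\|x - \xi_n\|^2)(x - \xi_n) \otimes (x - \xi_n). \]

Let $x_n \to x_0$, and denote
\[ \delta_n = \frac{1}{n} \|B_n(x_n)\|_{L_2(U_0, E)}^2. \]

Then $\delta_n \to 0$ according to Condition 1.11(3). Take $a_n = \delta_n^{-1/2}$ to be the one
in (3.9); then $a_n \to +\infty$ and
\[ \frac{1}{n} a_n \|B_n(x_n)\|_{L_2(U_0, E)}^2 = a_n \delta_n = \delta_n^{1/2} \to 0. \]
By (A.21),

(3.16)
\[ \lim_{n\to\infty} \frac{1}{2n} \mathrm{Tr}\big[ D^2(\varphi_2 \circ h_{n,\varepsilon,y})(x_n) B_n(x_n) B_n^*(x_n) \big] = 0. \]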


Therefore, by (3.13) through (3.16),
\[ \limsup_{n\to\infty} H_n f_n(x_n) \le (H_0 f)^*(x_0) \]
whenever $x_n \to x_0 \in E$. □


3.4. The comparison principle. Let $\alpha > 0$, and let $h \in C_b(E)$ be uniformly
continuous on $E$. The main goal of this subsection is to prove the comparison
principle in Lemma 3.10.

In what follows, we extend the operators $H_0$, $H_1$ and connect Feng and Kurtz's
definition of viscosity solution (Definition 1.14) with those in [31, 32] and [8].
We will introduce a new set of operators $\hat{H}_0, \hat{H}_1, \tilde{H}_0, \tilde{H}_1$, and will denote by $\bar{H}_0, \bar{H}_1$
the closures of $H_0$, $H_1$ under the graph norm topology in $B(E)$. We will clarify the
relations among the next four sets of equations:

(3.17) $(I - \alpha H_0) f = h$,
(3.18) $(I - \alpha H_1) f = h$;
(3.19) $(I - \alpha \bar{H}_0) f = h$,
(3.20) $(I - \alpha \bar{H}_1) f = h$;
(3.21) $(I - \alpha \hat{H}_0) f = h$,
(3.22) $(I - \alpha \hat{H}_1) f = h$
and
(3.23) $(I - \alpha \tilde{H}_0) f = h$,
(3.24) $(I - \alpha \tilde{H}_1) f = h$.

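Let $\varphi_i \in \mathcal{T}$ [see (1.32)], $y_i \in E$ and $\xi \in D(C)$,
\[ g(x) = \varphi_1(\|x - \xi\|^2), \]
\[ f(x) = g(x) + \varphi_2(d_C(x, y_1)) + \cdots + \varphi_{k+1}(d_C(x, y_k)); \]
we define the single-valued operator

(3.25)
\[ \hat{H}_0 f(x) = 2\varphi_1'(\|x - \xi\|^2) \langle x - \xi, C^0 \xi \rangle + \Big( \sup_{r > 0} \varphi_2'(r) + \cdots + \sup_{r > 0} \varphi_{k+1}'(r) \Big) \]
\[ \qquad + \sup_{\|q\| \le \varphi_2'(d_C(x, y_1)) + \cdots + \varphi_{k+1}'(d_C(x, y_k))} \Big( \langle F(x), Dg(x) + q \rangle + \frac{1}{2} \|B^*(x)(Dg(x) + q)\|_{U_0}^2 \Big). \]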



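By item (1) in Lemma A.12, (3.10) is equal to (3.25) when $\|x\|$ is sufficiently large,
independent of the $\varepsilon$. In addition, by (A.17),
\[ \lim_{\varepsilon \to 0} \sup_x |h_{\varepsilon,y}(x) - d_C(x, y)| = 0. \]

Therefore sending $\varepsilon \to 0$, we obtain $\hat{H}_0 \subset \bar{H}_0$, where $\bar{H}_0$ is the closure of $H_0$ under
the uniform norm for $B(E)$. Similarly, let
\[ g(x) = -\varphi_1(\|x - \xi\|^2), \]
\[ f(x) = g(x) - \big( \varphi_2(d_C(x, y_1)) + \cdots + \varphi_{k+1}(d_C(x, y_k)) \big) \]
and define

(3.26)
\[ \hat{H}_1 f(x) = -2\varphi_1'(\|x - \xi\|^2) \langle x - \xi, C^0 \xi \rangle - \Big( \sup_{r > 0} \varphi_2'(r) + \cdots + \sup_{r > 0} \varphi_{k+1}'(r) \Big) \]
\[ \qquad + \inf_{\|q\| \le \varphi_2'(d_C(x, y_1)) + \cdots + \varphi_{k+1}'(d_C(x, y_k))} \Big( \langle F(x), Dg(x) + q \rangle + \frac{1}{2} \|B^*(x)(Dg(x) + q)\|_{U_0}^2 \Big). \]
Then $\hat{H}_1 \subset \bar{H}_1$.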


LEMMA 3.5. $f$ is a viscosity subsolution of (3.17) for $H_0$ if and only if it is a
viscosity subsolution of (3.19) for $\bar{H}_0$; both imply $f$ is also a viscosity subsolution
of (3.21) for $\hat{H}_0$.

$f$ is a viscosity supersolution of (3.18) for $H_1$ if and only if it is a viscosity
supersolution of (3.20) for $\bar{H}_1$; both imply $f$ is also a viscosity supersolution
of (3.22) for $\hat{H}_1$.

Hence, the comparison principle for subsolution of (3.21) and supersolution
of (3.22) implies the comparison principles for (3.17) and (3.18), as well as those
for (3.19) and (3.20).

In this lemma viscosity solution is always meant in the sense of Definition 1.14.


PROOF. The conclusion follows from the fact that $\hat{H}_0 \subset \bar{H}_0$, $\hat{H}_1 \subset \bar{H}_1$ and
the definition of viscosity solution in Definition 1.14. □


We discuss some properties enjoyed by functions in $D(\hat{H}_i)$, $i = 0, 1$.


LEMMA 3.6. Let $f_0 \in D(\hat{H}_0)$. Suppose $x_0 \in E$ satisfies $(f - f_0)(x_0) =
\sup_{x \in E} (f - f_0)(x)$. Let $0 \le \varphi \in C^\infty([0, \infty))$ be nondecreasing, $\varphi(r) = r$ when
$r \le 1$ and $\varphi(r) = 2$ when $r > 2$. Let $\theta > 0$. We introduce a perturbation of $f_0$:

(3.27)
\[ f_\theta(x) = f_0(x) + \theta \varphi(d_C(x, x_0)). \]

Then $f_\theta$ has the following properties:




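(a) $f_\theta \in D(\hat{H}_0)$ and
\[ (f - f_\theta)(x_0) > (f - f_\theta)(x), \qquad x \ne x_0. \]
(b) For any $\{x_n\} \subset E$ satisfying
\[ \lim_{n\to\infty} (f - f_\theta)(x_n) = \sup_x (f - f_\theta)(x), \]
we have $x_n \to x_0$ and $f(x_n) \to f(x_0)$.
(c)
\[ \limsup_{\theta \to 0+} (\hat{H}_0 f_\theta)^*(x_0) \le (\hat{H}_0 f_0)^*(x_0), \]
\[ \liminf_{\theta \to 0+} (\hat{H}_1 f_\theta)_*(x_0) \ge (\hat{H}_1 f_0)_*(x_0). \]

PROOF. Part (a) follows from the definition of $f_\theta$.
We prove part (b) next. By (a),
\[ \lim_{n\to\infty} (f - f_\theta)(x_n) = \sup_x (f - f_\theta)(x) = (f - f_\theta)(x_0) = (f - f_0)(x_0). \]
Therefore
\[ (f - f_0)(x_0) = (f - f_\theta)(x_0) = \lim_{n\to\infty} (f - f_\theta)(x_n)
 = \lim_{n\to\infty} \big( (f - f_0)(x_n) - \theta \varphi(d_C(x_n, x_0)) \big)
 \le \liminf_{n\to\infty} (f - f_0)(x_n) \le (f - f_0)(x_0). \]
Hence
\[ \lim_{n\to\infty} \theta \varphi(d_C(x_n, x_0)) = \lim_{n\to\infty} \big( (f - f_0)(x_n) - (f - f_\theta)(x_n) \big) = 0, \]
which implies $x_n \to x_0$ and $(f - f_0)(x_n) \to (f - f_0)(x_0)$.
Part (c) follows from direct verification. □
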
LEMMA 3.7.

(a) If $f$ is a viscosity subsolution of (3.21) in the sense of Definition 1.14, then it
is also a viscosity subsolution in the sense of Definition 1.16 for $\hat{H}_0$.
(b) If $f$ is a viscosity supersolution of (3.22) in the sense of Definition 1.14, then
it is also a viscosity supersolution in the sense of Definition 1.16 for $\hat{H}_1$.


PROOF. We prove part (a) only. The proof for part (b) is similar. Let $\varphi_i \in \mathcal{T}$
[see (1.32)], $y_i \in E$ and $\xi \in D(C)$. Consider
\[ g_0(x) = \varphi_1(\|x - \xi\|^2), \]
\[ f_0(x) = g_0(x) + \varphi_2(d_C(x, y_1)) + \cdots + \varphi_{k+1}(d_C(x, y_k)) \in D(\hat{H}_0), \]


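and $x_0 \in E$ such that $(f - f_0)(x_0) = \sup_{x \in E} (f - f_0)(x)$. Define $f_\theta$ according
to (3.27). By Lemma 3.6, $f_\theta \in D(\hat{H}_0)$ and
\[ (f - f_\theta)(x_0) > (f - f_\theta)(x), \qquad x \ne x_0. \]

Since $f$ is a subsolution of (3.21) in the sense of Definition 1.14, there exists a
sequence $\{x_n\} \subset E$ such that
\[ \lim_{n\to\infty} (f - f_\theta)(x_n) = \sup_x (f - f_\theta)(x) \]
and
\[ \limsup_{n\to\infty} \big( \alpha^{-1}(f - h)(x_n) - (\hat{H}_0 f_\theta)^*(x_n) \big) \le 0. \]

By Lemma 3.6, $x_n \to x_0$, $f(x_n) \to f(x_0)$ and
\[ \limsup_{\theta \to 0+} (\hat{H}_0 f_\theta)^*(x_0) \le (\hat{H}_0 f_0)^*(x_0). \]
Hence
\[ \alpha^{-1}(f - h)(x_0) \le \limsup_{\theta \to 0+} (\hat{H}_0 f_\theta)^*(x_0) \le (\hat{H}_0 f_0)^*(x_0). \]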


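We define $\tilde{H}_0$ and $\tilde{H}_1$ below. For

(3.28)
\[ g(x) = \frac{\mu}{2} \|x - \xi\|^2 \quad \forall \mu > 0,\ \xi \in D(C), \qquad
 f(x) = g(x) + \rho\, d_C(x, y) \quad \forall y \in E,\ \rho > 0 \]
(recall the definition of $d_C$ in Definition 1.13), we define

(3.29)
\[ \tilde{H}_0 f(x) = \mu \langle x - \xi, C^0 \xi \rangle + \rho + \sup_{\|q\| \le \rho} \Big( \langle F(x), Dg(x) + q \rangle + \frac{1}{2} \|B^*(x)(Dg(x) + q)\|_{U_0}^2 \Big); \]

similarly, for

(3.30)
\[ g(x) = -\frac{\mu}{2} \|x - \xi\|^2 \quad \forall \mu > 0,\ \xi \in D(C), \qquad
 f(x) = g(x) - \rho\, d_C(x, y) \quad \forall y \in E,\ \rho > 0, \]
we define

(3.31)
\[ \tilde{H}_1 f(x) = -\mu \langle x - \xi, C^0 \xi \rangle - \rho + \inf_{\|q\| \le \rho} \Big( \langle F(x), Dg(x) + q \rangle + \frac{1}{2} \|B^*(x)(Dg(x) + q)\|_{U_0}^2 \Big). \]

$\hat{H}_0$, $\hat{H}_1$ are local operators; therefore we can get rid of the localization functions
$\varphi_k \in \mathcal{T}$ to arrive at $\tilde{H}_0$, $\tilde{H}_1$. We omit the proof here. Such argument is standard. In
Lemma 6.1 and Theorem 6.1 of [20], these types of arguments are used to prove
equivalence of different definitions of viscosity solution. We have the following
conclusion.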




LEMMA 3.8. If $f$ is a viscosity subsolution of (3.21) for $\hat{H}_0$, then it is also
a subsolution of (3.23) for $\tilde{H}_0$; both in the sense of Definition 1.16.

If $f$ is a viscosity supersolution of (3.22) for $\hat{H}_1$, then it is also a supersolution
of (3.24) for $\tilde{H}_1$; both in the sense of Definition 1.16.


Summarizing conclusions in Lemmas 3.5, 3.7 and 3.8, we have the next result.


LEMMA 3.9. Let $f$ be a subsolution to (3.17) for $H_0$ in the sense of
Definition 1.14 (Feng and Kurtz); then it is a subsolution to (3.23) for $\tilde{H}_0$ in the sense of
Definition 1.16 (Tataru-Crandall-Lions).

Let $f$ be a supersolution to (3.18) for $H_1$ in the sense of Definition 1.14 (Feng
and Kurtz); then it is a supersolution to (3.24) for $\tilde{H}_1$ in the sense of Definition 1.16
(Tataru-Crandall-Lions).


We will study the comparison principle for viscosity solutions in the sense of
Definition 1.16 in Section 5. In view of Lemma 3.9, Theorem 5.1 implies the
following.


LEMMA 3.10. Let $\overline{f}$ be a subsolution to (3.17) for $H_0$ and let $\underline{f}$ be a
supersolution to (3.18) for $H_1$, both in the sense of Definition 1.14. Then $\overline{f} \le \underline{f}$.


3.5. The large deviation theorem. 


THEOREM 3.11. Suppose Conditions 1.11 and 1.12 are satisfied. Let $X_n \in
C_E[0, \infty)$ be the solution of (1.7). Suppose further that $\{X_n(0)\}$ satisfies the large
deviation principle with good rate function $I_0$ on $E$.

Then:
(a) $\{X_n\}$ is exponentially tight;
(b) the following limit exists and defines an operator semigroup on $C_b(E)$:

(3.32)
\[ V(t) f(x) = \lim_{n\to\infty} \frac{1}{n} \log E\big[ e^{n f(X_n(t))} \,\big|\, X_n(0) = x \big]; \]
(c) the large deviation principle holds for $\{X_n\}$ with good rate function $I$:

(3.33)
\[ I(x) = I_0(x(0)) + \sup_{0 \le t_1 \le \cdots \le t_m} \ \sum_{i=1}^{m} I_{t_i - t_{i-1}}\big( x(t_i) \mid x(t_{i-1}) \big), \]
where
\[ I_t(y \mid x) = \sup_{f \in C_b(E)} \big( f(y) - V(t) f(x) \big). \]


PROOF. Define $F \subset C_b(E)$ according to (3.12). The operator convergence in
Lemma 3.4 and the comparison principle in Lemma 3.10 imply that Theorem 2.3
holds. Consequently the conclusion follows. □




4. Application to special cases. We solve Examples 1.2, 1.5 and 1.8 as special
cases of Theorem 3.11.


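4.1. Stochastic Allen-Cahn equation. Recall that we take $E = U_0 = L^2(\mathcal{O})$.
Let $\omega = \sup_r |V''(r)|$. We take

(4.1)
\[ (Cx)(\theta) = \Delta x(\theta) - V'(x(\theta)) + V'(0) - \omega x(\theta), \]
where $D(C) = H^2(\mathcal{O})$, the space of $x \in L^2(\mathcal{O})$ whose first- and second-order partial
derivatives $\partial_{\theta_i} x$, $\partial_{\theta_i} \partial_{\theta_j} x$ lie in $L^2(\mathcal{O})$,
and
\[ F(x) = -V'(0) + \omega x. \]

$C$ is m-dissipative by the usual theory of semilinear equations. Recall the $C_n$ and $F_n$
in (1.18); we have

(4.2)
\[ \lim_{n\to\infty} \|C_n \xi - C \xi\|_{L^2(\mathcal{O})} = 0 \quad \forall \xi \in D(C), \qquad
 \lim_{x_n \to x_0} F_n(x_n) = F(x_0). \]

In addition, let $B_n(x)$ be defined according to (1.16); then
\[ \lim_{x_n \to x_0,\ p_n \to p_0} \|B_n^*(x_n) p_n\|_{L^2(\mathcal{O})} = \|B^*(x_0) p_0\|_{L^2(\mathcal{O})}. \]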


Let {e1,...,ez,...} be the orthonormal system for Up = L*(Q) as defined 
in (1.11). Let c, o be defined according to (1.9). Since 


Il Bn Gon MI. quis, y = Tr(B En) Bn Xn) = X Bn Qe? 
k 


=E Y dec (Pm sen ei)? 


on Mn) 
= » POC, (Pans Eei, ed] 


iz(1,.,1) k 


(Mn , vn Hg) 


"m 2: lo, (Paxn, Eei ll? < ma SP q^ (0, r), 


Condition 1.11(3) holds under the scaling requirement (1.14). 
Finally, Condition 1.12 is verified by Lemma A.5. Therefore, Theorem 1.4 fol- 
lows from Theorem 3.11. 




To simplify the form of the rate function from (1.31) to a time integral form 
as in (1.19), we need additional work. The basic idea is the same as that pre- 
sented in Example 2.5, with some technical complications because of the infinite- 
dimensional state space. A general result for rate function representation is 
developed in Chapter 8 of [18]. Applying such result, representation (1.19) for lat- 
tice versions of the stochastic Allen-Cahn equation is proved rigorously in Chap- 
ter 13 of [18]. This procedure can be carried out similarly here. Below, we only 
provide a sketch. 

First, the form of $\tilde{H}_0$, $\tilde{H}_1$ in (3.29) and (3.31) induces an optimally controlled
PDE problem:

(4.3)
\[ \partial_t x(t, \theta) = Cx + F(x) + B(x) u(t) = \Delta x(t, \theta) - V'(x(t, \theta)) + \sigma(x, \theta) u(t, \theta), \]
which is well defined under a finite running cost assumption:
\[ \frac{1}{2} \int_0^\infty \int_{\mathcal{O}} |u(t, \theta)|^2\,d\theta\,dt < \infty. \]

By the dynamic programming principle and the comparison principle for (3.23)
and (3.24), we can prove that the $V(t)$ in (3.32) has the form
\[ V(t) f(x_0) = \sup\Big\{ f(x(t)) - \frac{1}{2} \int_0^t \int_{\mathcal{O}} u^2(s, \theta)\,d\theta\,ds : (x, u) \text{ satisfies (4.3) and } x(0) = x_0 \Big\}. \]
Then from this, we derive
\[ I_t(x_1 \mid x_0) = \inf\Big\{ \frac{1}{2} \int_0^t \int_{\mathcal{O}} u^2(s, \theta)\,d\theta\,ds : (x, u) \text{ satisfies (4.3) with } x(0) = x_0,\ x(t) = x_1 \Big\}. \]
Combine the above form with (4.3); the variational form of $I(x)$ in (1.19) follows.


4.2. Stochastic Cahn-Hilliard equation. Let $\omega = \sup_r |V''(r)|^2 / 4$. We define

(4.4)
\[ (Cx)(\theta) = \Delta\big( -\Delta x(\theta) + V'(x(\theta)) \big) - \omega x(\theta) \]
for $x \in D(C) = W^{4,2}(\mathcal{O})$, the space of $x \in L^2(\mathcal{O})$ whose partial derivatives up to
fourth order lie in $L^2(\mathcal{O})$,




and
\[ (F(x))(\theta) = \omega x(\theta), \qquad x \in L^2(\mathcal{O}). \]

We recall that the $C_n$, $F_n$ are defined as in (1.23). By Lemma A.1, $C$ and $C_n$ are
m-dissipative in $L^2(\mathcal{O})$. Furthermore, the type of convergence in (4.2) holds here
by direct verification.

By Lemma A.7, Condition 1.12 is also satisfied.

Theorem 1.7 follows from Theorem 3.11.

The rate function representation in (1.24) can be proved using similar arguments
as in the Allen-Cahn case. The controlled PDE becomes
\[ \partial_t x(t, \theta) = Cx + F(x) + B(x) u(t) = \Delta\big( -\Delta x(t, \theta) + V'(x(t, \theta)) \big) + u(t, \theta). \]
The running cost structure is the same.


4.3. Stochastic quasilinear equation with viscosity. Let $C_n$ be defined according
to (1.27) and let

(4.5)
\[ Cx = a\, \partial_\theta^2 x - \partial_\theta G(x) - \omega x, \qquad x \in H^2(\mathcal{O}). \]

Then both $C_n$ and $C$ are m-dissipative operators in $L^2(\mathcal{O})$ (Lemma A.4).
The compact containment estimate is provided in Lemma A.9. Following the
same arguments as above, Theorem 1.9 follows as a special case of Theorem 3.11.
Rate function representation is the same as the Allen-Cahn case. The controlled
PDE is
\[ \partial_t x(t, \theta) = Cx + F(x) + B(x) u(t) = a\, \partial_\theta^2 x(t, \theta) - \partial_\theta G(x(t, \theta)) + u(t, \theta). \]


5. A class of Hamilton-Jacobi equation in Hilbert space. The purpose of
this section is to present a self-contained proof of the comparison principle for
(5.2) and (5.3) in the viscosity solution sense by Tataru-Crandall-Lions
(Definition 1.16); see Theorems 5.1 and 5.2. The whole section is independent of the rest of
the paper and can be read separately.

We point out that the comparison results in [31, 32] and in [8] cannot be directly
borrowed here, because the $\tilde{H}_0$, $\tilde{H}_1$ are not exactly of the same form as considered
there. For example, when defining these operators, we restrict the domains and
use rougher estimates than the $D_C^+$ and $D_C^-$ (Definition 3.3). This allows us to
relate $\tilde{H}_0$, $\tilde{H}_1$ with other operators which arise as the kind of limits required by
Theorem 2.3 with graphs contained in $C_b(E) \times B(E)$. Second but more importantly,
the quadratic nonlinearity in (5.6) is worse than that assumed in (ii) of (49)
in [8]. We exploit convexity to cure this problem. Despite these differences, the
main ideas of [8, 31, 32] still apply and all we need are modifications and refinements
at various places. We follow [8] and present the proof through a doubling




technique (Lemma 5.9). To make this paper self-contained, we will repeat the
important steps of [8] and omit minor details. We make detailed references for all
omitted steps so that they can easily be recovered if needed.

Let $E$ be a separable real Hilbert space, and let $C \subset E \times E$ be a possibly multivalued
nonlinear m-dissipative operator with $\overline{D(C)} = E$ and $(0, 0) \in C$. Further suppose
function $F : E \to E$ and operator $B(x) : E \to E$ for each $x \in E$ satisfy

(5.1)
\[ L_{F,B} \equiv \sup_{x \ne y} \frac{\|F(x) - F(y)\| + \|B(x) - B(y)\|}{\|x - y\|} < \infty. \]
Recall the definition of $\tilde{H}_0, \tilde{H}_1 \subset C(E) \times M(E)$ in (3.29) and (3.31); we
consider the following Hamilton-Jacobi equations written in the resolvent form: let
$h \in C_b(E)$ and $\alpha > 0$,

(5.2)
\[ (I - \alpha \tilde{H}_0) f = h \]
and

(5.3)
\[ (I - \alpha \tilde{H}_1) f = h. \]
We prove the following.


THEOREM 5.1 (Comparison principle). Let $\overline{f}$ be a subsolution of (5.2) and $\underline{f}$
be a supersolution of (5.3), both in the sense of Definition 1.16.
Suppose $h$ is uniformly continuous and (5.1) is satisfied. Then
\[ \overline{f} \le \underline{f}. \]


Indeed, we will prove a theorem covering more general situations. Let $G(x, p) \in
C(E \times E)$. We define single-valued operators $\tilde{H}_0, \tilde{H}_1 \subset C(E) \times M(E)$: for each $f$
in (3.28), we define

(5.4)
\[ \tilde{H}_0 f(x) = \mu \langle x - \xi, C^0 \xi \rangle + \rho + \sup_{\|q\| \le \rho} G(x, Dg(x) + q). \]

For each $f$ in (3.30), we define

(5.5)
\[ \tilde{H}_1 f(x) = -\mu \langle x - \xi, C^0 \xi \rangle - \rho + \inf_{\|q\| \le \rho} G(x, Dg(x) + q). \]
The operators in (5.2) and (5.3) are special cases of the above ones with $G$ given
by

(5.6)
\[ G(x, p) = \langle F(x), p \rangle + \frac{1}{2} \|B^*(x) p\|^2. \]


THEOREM 5.2. Suppose $\alpha > 0$, $h$ is uniformly continuous and Condition 5.5
is satisfied for $G$. Define $\tilde{H}_0$, $\tilde{H}_1$ according to (5.4) and (5.5).

Let $\overline{f}$ be a subsolution of (5.2) and $\underline{f}$ be a supersolution of (5.3), both in the
sense of Definition 1.16.

Then
\[ \overline{f} \le \underline{f}. \]




5.1. Perturbed optimization principle. There is no a priori guarantee that the
extrema in (1.40) and (1.41) of Definition 1.16 always exist. We have to carefully
choose test functions $f_0$ to make sure that the definition is not an empty one.
Ekeland's perturbed optimization principle [14] claims that, if we add a small
perturbation to the test function using the norm of the Hilbert space, we can always
attain the extrema. If we apply this technique in the viscosity solution context, we
also want the perturbed $H f_0$ to be close to the unperturbed one, so that the
equation we consider does not change much. These considerations lead to the Tataru
distance function $d_C$ in Definition 1.13.

The following is adapted from Proposition 2.1 of [31], which generalizes
Ekeland's principle. In the adaptation, we have taken $g = -u$, where $u$ is the
function in the original proposition.


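LEMMA 5.3. Let $K$ be an abstract set and $B : K \times K \to [0, +\infty)$, with the
following properties:

(a) $B(x, x) = 0$ for all $x \in K$;

(b) $B(x, y) + B(y, z) \ge B(x, z)$ for all $x, y, z \in K$;

(c) for each $\{x_n\} \subset K$ satisfying $\sum_n B(x_n, x_{n+1}) < \infty$, there exists $x \in K$ such
that $\lim_{n\to\infty} B(x_n, x) = 0$.

Let $g : K \to [-\infty, +\infty)$, $\sup_{x \in K} g(x) < +\infty$. Furthermore, if $\{x_n\} \subset K$ and
$x \in K$ satisfy $\sum_n B(x_n, x_{n+1}) < +\infty$ and $\lim_{n\to\infty} B(x_n, x) = 0$, then $g(x) \ge
\limsup_{n\to\infty} g(x_n)$.

Then, for each $\varepsilon > 0$, and $x_0$ such that $g(x_0) \ne -\infty$, there exists $x_\varepsilon$ such that:

(1) $g(x_0) + \varepsilon B(x_0, x_\varepsilon) \le g(x_\varepsilon)$,
(2) $g(x) - \varepsilon B(x, x_\varepsilon) \le g(x_\varepsilon)$, $x \in K$.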


REMARK 5.4. Let function $g : E \to \mathbb{R}$ be bounded and upper semicontinuous
in the norm topology of $E$. If we take $K = E$ and $B = d_C$, then the assumptions
regarding $B$ are satisfied (Proposition 2.2 of [31]), and:

(a) For each $x_0 \in E$ and $\varepsilon > 0$, there exists an $x_1 \in E$ such that

(5.7)
\[ g(x_0) + \varepsilon d_C(x_0, x_1) \le g(x_1) \]
and

(5.8)
\[ g(x) - \varepsilon d_C(x, x_1) \le g(x_1), \qquad x \in E. \]

(b) Let $\varepsilon > 0$ and $x_0 \in E$ be such that
\[ \sup_{x \in E} g(x) \le g(x_0) + \varepsilon^2. \]
Then there exists $x_1 \in E$ such that not only (5.7) and (5.8) hold, but also
\[ d_C(x_0, x_1) \le \varepsilon. \]




Part (a) is a consequence of the above proposition. See also Lemmas 2.4, 2.5 of [8].
Part (b) follows from (5.7):
\[ \varepsilon d_C(x_0, x_1) \le g(x_1) - g(x_0) \le g(x_1) - \sup_{x \in E} g(x) + \varepsilon^2 \le \varepsilon^2. \]

Similarly, an analogous result holds when we take $K = E \times E$ and $B((x_1, y_1),
(x_2, y_2)) = d_C(x_1, x_2) + d_C(y_1, y_2)$. We will need such result for (5.14).


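5.2. The comparison principle. We make the following structural assumption
about $G$.


CONDITION 5.5.

(1) $G(x, p) \in C(E \times E)$; for each $\lambda > 1$ and $M > 0$ fixed, there exist $\omega_\lambda(r)$,
$\sigma_M(r) \in C(\mathbb{R}_+)$ with $\omega_\lambda(0) = 0$ and $\sigma_M(0) = 0$ such that
\[ \lambda G\Big( x, \frac{\mu(x - y)}{\lambda} \Big) - G\big( y, \mu(x - y) \big) \le \omega_\lambda\big( \|x - y\| + \mu \|x - y\|^2 \big), \qquad \mu \ge 0, \]
and
\[ \sup_{\|x\| + \|p\| + \|q\| \le M,\ \|q - p\| \le r} |G(x, p) - G(x, q)| \le \sigma_M(r). \]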
(2) There exists a nondecreasing function 0 < yg € C'!([0, 00)) slowly grow- 
ing to infinity in the sense that lim, , g(r) = oo and sup,.o |rg’(r7)| + 
Ire'(r)| < oo. yg’ > 0. y d] 
For M > 0, X > 1, there exists y}, m (r) € C(R*) with y}, m (0) = 0 such 
that 


p + e9' (llxli^)x 


sup G(x, 7 


)- 0, P) x». 
xéE,||p||<M 


REMARK 5.6. Condition 5.5 implies that for any 1 < Ao < A1, 


EN 2 w(x — 2) 


A1G1 x, 
G(x Ao 


H 
Uu (ix -yl + Eds - y1?) 


and 


+ eg! (Ix ||^)x € 
sup (G(x. P AS oa j = wG(x, 2)) < Vrae ar): 
x€E,lplM Ài ro ho 


g(r) = log(1 + r) satisfies the growth requirement in the condition. For many 
examples, such choice is good enough. 




LEMMA 5.7. Suppose (5.1) is satisfied; then the G in (5.6) satisfies Condi- 
tion 5.5. 


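PROOF. To verify this, let $\lambda > 1$; noting
\[ \Big( \frac{1}{\lambda} - 1 \Big) b^2 + \frac{2}{\lambda} a b \le \frac{a^2}{\lambda(\lambda - 1)}, \]
we have
\[ \lambda G\Big( x, \frac{p}{\lambda} \Big) - G(y, p)
 = \Big( \frac{1}{2\lambda} \|B^*(x) p\|^2 - \frac{1}{2} \|B^*(y) p\|^2 \Big) + \langle F(x) - F(y), p \rangle \]
\[ \le \frac{1}{2} \Big( \frac{1}{\lambda} \|(B^*(x) - B^*(y)) p\|^2 + \Big( \frac{1}{\lambda} - 1 \Big) \|B^*(y) p\|^2
 + \frac{2}{\lambda} \|(B^*(x) - B^*(y)) p\| \, \|B^*(y) p\| \Big) + L_{F,B} \|x - y\| \|p\| \]
\[ \le \frac{1}{2} \frac{\|(B^*(x) - B^*(y)) p\|^2}{\lambda - 1} + L_{F,B} \|x - y\| \|p\|
 \le \frac{1}{2} \frac{L_{F,B}^2 \|x - y\|^2 \|p\|^2}{\lambda - 1} + L_{F,B} \|x - y\| \|p\|, \]
where $L_{F,B}$ is the one in (5.1). We can take
\[ \omega_\lambda(r) = \frac{1}{2} \frac{L_{F,B}^2}{\lambda - 1} r^2 + L_{F,B}\, r. \]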
Similarly, denoting Co = ||| BO)II, 





+E 1 1 1 
e(a PZI) - Gee, p)  z((z - JIB GO pl, + iI eal, 
2 
+ ZIB" GOpllu lB eq luo) 
+ ell F GO gi 
1 18*G9edll2;, 
< 2-59 07 + eL r.nplxilligi| 
e^ (C +L xl)? 
<2 (Colla + Eia MED? | oy vena, 
2 À -—1 
We can take g(r) = log(1-- r) and 
£? (Co + L)? 
ya m (€) = ERE +eLp B. 


2 XÀ-1 




Finally, for each $M > 0$, we denote
\[ C_1(M) = \sup_{\|x\| \le M} \|B(x)\|, \qquad
 C_2(M) = \sup_{\|x\| \le M} \|B(x)\|^2 M, \qquad
 C_3(M) = \sup_{\|x\| \le M} \|F(x)\|. \]

Then for $\|x\| + \|p\| + \|q\| \le M$,
\[ |G(x, p) - G(x, q)| \le \|B^*(x)(p - q)\|^2 + 2 \|B^*(x) q\| \, \|B^*(x)(p - q)\| + \|F(x)\| \|p - q\| \]
\[ \le C_1^2(M) \|p - q\|^2 + 2 C_2(M) \|p - q\| + C_3(M) \|p - q\|. \]
We can take $\sigma_M(r) = C_1^2(M) r^2 + (2 C_2(M) + C_3(M)) r$. □
Now, we prove Theorem 5.2 in several steps. 
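We endow the product space $E \times E$ with the inner product
\[
\langle (x,y), (\xi,\eta) \rangle = \langle x, \xi \rangle + \langle y, \eta \rangle.
\]
Denote
\[
C(x,y) = (Cx, Cy).
\]
By the m-dissipativity of $C$, $C$ induces a semigroup:
\[
S(t)(x,y) = (S(t)x, S(t)y), \qquad (x,y) \in E \times E.
\]
For $\lambda > 0$, $(x,y), (p,q) \in E \times E$, we define
\[
G_\lambda((x,y);(p,q)) = \lambda G\Bigl(x, \frac{p}{\lambda}\Bigr) - G(y, -q),
\]
and define a single-valued operator $H_{2,\lambda} \subset C(E \times E) \times M(E \times E)$ next. Let
$D = D(H_{2,\lambda})$ consist of functions $\Phi$ defined as follows:
\[
\phi(x,y) = \frac{\mu}{2}\|x-y\|^2 + \frac{\varepsilon}{2}\bigl(\varphi(\|x\|^2) + \varphi(\|y\|^2)\bigr), \qquad \varepsilon, \mu > 0, \tag{5.9}
\]
where $\varphi \in C^1([0,\infty))$, $\varphi' \ge 0$, $\lim_{r \to +\infty} \varphi(r) = +\infty$; and
\[
\Phi(x,y) = \phi(x,y) + \rho\bigl(d_C(x,x_0) + d_C(y,y_0)\bigr), \qquad \rho > 0,\ x_0, y_0 \in E.
\]
We define, for each $\Phi$,
\[
H_{2,\lambda}\Phi(x,y) = D_C^+\Phi(x,y) + \sup_{\|p\|^2 + \|q\|^2 \le 2\rho^2} G_\lambda\bigl((x,y); D\phi(x,y) + (p,q)\bigr). \tag{5.10}
\]
We note that $H_{2,\lambda}\Phi(x,y)$ may be an unbounded function on $E \times E$.
We have a perturbation result.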




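LEMMA 5.8. Let $u \in B(E \times E)$ be upper semicontinuous, $\Phi_0(x,y) \in D$ and
$(\hat x, \hat y) \in E \times E$ satisfy $(u - \Phi_0)(\hat x, \hat y) = \sup_{(x,y) \in E \times E}(u - \Phi_0)(x,y)$. Suppose
$\kappa > 0$; we define
\[
\Phi_\kappa(x,y) = \Phi_0(x,y) + \kappa\bigl(d_C(x,\hat x) + d_C(y,\hat y)\bigr). \tag{5.11}
\]
Then $\Phi_\kappa$ has the following properties:
(a)
\[
(u - \Phi_\kappa)(\hat x, \hat y) > (u - \Phi_\kappa)(x,y) \qquad \forall (x,y) \ne (\hat x, \hat y).
\]
(b) For any $\{(x_n, y_n)\} \subset E \times E$ satisfying
\[
\lim_{n \to \infty}(u - \Phi_\kappa)(x_n, y_n) = \sup_{(x,y) \in E \times E}(u - \Phi_\kappa)(x,y),
\]
we have $(x_n, y_n) \to (\hat x, \hat y)$ and $u(x_n, y_n) \to u(\hat x, \hat y)$.
PROOF. The same arguments as in the proof of Lemma 3.6 apply here. □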


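LEMMA 5.9 (Doubling lemma). Let $\lambda > 0$ and Condition 5.5 be satisfied.
Define $u(x,y) = \lambda \bar f(x) - \underline f(y)$ and $v(x,y) = \lambda h(x) - h(y)$, where the $\bar f$ and $\underline f$
are the ones given in Theorem 5.2.

Then $u$ is a viscosity subsolution (in the sense of Definition 1.16) of
\[
(I - \alpha H_{2,\lambda})u = v.
\]

PROOF. Let $\Phi_0 \in D$ and $(\hat x, \hat y) \in E \times E$ satisfy
\[
(u - \Phi_0)(\hat x, \hat y) = \sup_{x,y \in E}(u - \Phi_0)(x,y).
\]
We want to show that
\[
\alpha^{-1}\bigl(u(\hat x, \hat y) - v(\hat x, \hat y)\bigr) \le (H_{2,\lambda}\Phi_0)^*(\hat x, \hat y). \tag{5.12}
\]
We may assume $\Phi_0$ takes the following form:
\[
\Phi_0(x,y) = \phi(x,y) + \rho\bigl(d_C(x,x_0) + d_C(y,y_0)\bigr)
\]
for some $x_0, y_0 \in E$, where
\[
\phi(x,y) = \frac{\mu}{2}\|x-y\|^2 + \frac{\varepsilon}{2}\bigl(\varphi(\|x\|^2) + \varphi(\|y\|^2)\bigr)
\]
takes the form in (5.9). Fix $\kappa > 0$; we let
\[
\Phi(x,y) = \Phi_0(x,y) + \kappa\bigl(d_C(x,\hat x) + d_C(y,\hat y)\bigr).
\]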


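By Lemma 5.8,
\[
(u - \Phi)(\hat x, \hat y) > (u - \Phi)(x,y) \qquad \forall (x,y) \ne (\hat x, \hat y).
\]
We define
\begin{align*}
\Psi(x,y) &= u(x,y) - \Phi(x,y), \\
\Psi_\varepsilon(x,y,\xi,\eta) &= u(x,y) - \Phi(\xi,\eta) - \frac{1}{2\varepsilon}\bigl(\|x-\xi\|^2 + \|y-\eta\|^2\bigr)
\end{align*}
and
\[
\Psi_{\varepsilon,\delta}(x,y,\xi,\eta) = u(x,y) - \Phi(\xi,\eta) - \frac{1}{2\varepsilon}\bigl(\|x-\xi\|^2 + \|y-\eta\|^2\bigr) - \delta\bigl(\|C^0\xi\| + \|C^0\eta\|\bigr),
\]
where $\|C^0\xi\| = +\infty$ if $\xi \notin D(C)$.
We write the maximum of each of these functions:
\begin{align*}
M &= \sup_{x,y \in E} \Psi(x,y) = \Psi(\hat x, \hat y), \\
M_\varepsilon &= \sup_{x,y,\xi,\eta \in E} \Psi_\varepsilon(x,y,\xi,\eta), \\
M_{\varepsilon,\delta} &= \sup_{x,y,\xi,\eta \in E} \Psi_{\varepsilon,\delta}(x,y,\xi,\eta).
\end{align*}
It follows that $M \le M_\varepsilon$, $M_\varepsilon \ge M_{\varepsilon,\delta}$ and $M_\varepsilon \downarrow M$ as $\varepsilon \downarrow 0$ and $M_{\varepsilon,\delta} \uparrow M_\varepsilon$
as $\delta \downarrow 0$. See, for example, page 83 of [8]. The definitions of $M_\varepsilon$, $M_{\varepsilon,\delta}$ in [8]
are slightly different from those here, in the sense that suprema are taken locally over
a ball of size $2r$ with arbitrary $r > 0$, instead of over the whole space. Note
that $\sup_{x,y} |u(x,y)| < +\infty$; note also that the form of $\phi$ in (5.9) implies that
$\lim_{\|x\| + \|y\| \to +\infty} \Phi(x,y) = +\infty$, hence the suprema over the whole space are
equal to the corresponding suprema over a sufficiently large open ball. Therefore,
the same proof as in [8] still works here.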


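(1) For each $\varepsilon, \theta > 0$, we can choose $\delta = \delta(\varepsilon,\theta)$ and $(x_{\varepsilon,\theta}, y_{\varepsilon,\theta}, \xi_{\varepsilon,\theta}, \eta_{\varepsilon,\theta})$ such
that $\delta(\varepsilon,\theta) \downarrow 0$ as $\varepsilon, \theta \downarrow 0$ and
\[
M_\varepsilon - \theta < \Psi_{\varepsilon,\delta(\varepsilon,\theta)}(x_{\varepsilon,\theta}, y_{\varepsilon,\theta}, \xi_{\varepsilon,\theta}, \eta_{\varepsilon,\theta}). \tag{5.13}
\]
By Lemma 5.3, we can always select the $(x_{\varepsilon,\theta}, y_{\varepsilon,\theta}, \xi_{\varepsilon,\theta}, \eta_{\varepsilon,\theta})$ so that (note
that $\|C^0 x\|$ as a function of $x$ is lower semicontinuous on $E$; see, e.g.,
Lemma 2.18 in [26])
\begin{align*}
&\Psi_{\varepsilon,\delta(\varepsilon,\theta)}(x,y,\xi,\eta) - \varepsilon\bigl(d_C(x, x_{\varepsilon,\theta}) + d_C(y, y_{\varepsilon,\theta}) + d_C(\xi, \xi_{\varepsilon,\theta}) + d_C(\eta, \eta_{\varepsilon,\theta})\bigr) \\
&\qquad \le \Psi_{\varepsilon,\delta(\varepsilon,\theta)}(x_{\varepsilon,\theta}, y_{\varepsilon,\theta}, \xi_{\varepsilon,\theta}, \eta_{\varepsilon,\theta}) \qquad \forall x, y, \xi, \eta \in E. \tag{5.14}
\end{align*}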




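Let $x = x_{\varepsilon,\theta}$, $y = y_{\varepsilon,\theta}$; then
\begin{align*}
&\Phi(\xi_{\varepsilon,\theta}, \eta_{\varepsilon,\theta}) + \frac{1}{2\varepsilon}\bigl(\|x_{\varepsilon,\theta} - \xi_{\varepsilon,\theta}\|^2 + \|y_{\varepsilon,\theta} - \eta_{\varepsilon,\theta}\|^2\bigr) + \delta(\varepsilon,\theta)\bigl(\|C^0\xi_{\varepsilon,\theta}\| + \|C^0\eta_{\varepsilon,\theta}\|\bigr) \\
&\qquad \le \Phi(\xi,\eta) + \frac{1}{2\varepsilon}\bigl(\|x_{\varepsilon,\theta} - \xi\|^2 + \|y_{\varepsilon,\theta} - \eta\|^2\bigr) + \delta(\varepsilon,\theta)\bigl(\|C^0\xi\| + \|C^0\eta\|\bigr) \\
&\qquad\quad + \varepsilon\bigl(d_C(\xi, \xi_{\varepsilon,\theta}) + d_C(\eta, \eta_{\varepsilon,\theta})\bigr). \tag{5.15}
\end{align*}
From $\Psi_\varepsilon \ge \Psi_{\varepsilon,\delta(\varepsilon,\theta)}$ and (5.13),
\[
\Psi_\varepsilon(x,y,\xi,\eta) \le \Psi_\varepsilon(x_{\varepsilon,\theta}, y_{\varepsilon,\theta}, \xi_{\varepsilon,\theta}, \eta_{\varepsilon,\theta}) + \theta.
\]
Take $x = x_{\varepsilon,\theta}$, $y = y_{\varepsilon,\theta}$; therefore
\begin{align*}
&\Phi(\xi,\eta) + \frac{1}{2\varepsilon}\bigl(\|x_{\varepsilon,\theta} - \xi\|^2 + \|y_{\varepsilon,\theta} - \eta\|^2\bigr) \\
&\qquad \ge \Phi(\xi_{\varepsilon,\theta}, \eta_{\varepsilon,\theta}) + \frac{1}{2\varepsilon}\bigl(\|x_{\varepsilon,\theta} - \xi_{\varepsilon,\theta}\|^2 + \|y_{\varepsilon,\theta} - \eta_{\varepsilon,\theta}\|^2\bigr) - \theta, \qquad \xi, \eta \in E. \tag{5.16}
\end{align*}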
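(2) It follows from (5.13) that $\lim_{\varepsilon \to 0+, \theta \to 0+} \Psi_{\varepsilon,\delta(\varepsilon,\theta)}(x_{\varepsilon,\theta}, y_{\varepsilon,\theta}, \xi_{\varepsilon,\theta}, \eta_{\varepsilon,\theta}) = M$, hence
\[
(x_{\varepsilon,\theta}, y_{\varepsilon,\theta}),\ (\xi_{\varepsilon,\theta}, \eta_{\varepsilon,\theta}) \to (\hat x, \hat y), \qquad
\bar f(x_{\varepsilon,\theta}) \to \bar f(\hat x), \quad \underline f(y_{\varepsilon,\theta}) \to \underline f(\hat y) \quad \text{as } \varepsilon, \theta \downarrow 0 \tag{5.17}
\]
(e.g., Step 3 on page 84 of [8] and Lemma 5.8).
Take $\xi = \xi_{\varepsilon,\theta}$, $\eta = \eta_{\varepsilon,\theta}$, $y = y_{\varepsilon,\theta}$ in (5.14):
\[
x \mapsto \lambda \bar f(x) - \frac{1}{2\varepsilon}\|x - \xi_{\varepsilon,\theta}\|^2 - \varepsilon d_C(x, x_{\varepsilon,\theta})
\]
has a maximum at $x_{\varepsilon,\theta}$. Since $\bar f$ is a subsolution of (5.2) in the sense of Definition 1.16,
\[
\frac{\bar f - h}{\alpha}(x_{\varepsilon,\theta}) \le \frac{1}{\lambda\varepsilon}\langle x_{\varepsilon,\theta} - \xi_{\varepsilon,\theta}, C^0\xi_{\varepsilon,\theta}\rangle + \frac{\varepsilon}{\lambda}
+ \frac{1}{\lambda}\sup_{\|p\| \le \varepsilon} \lambda G\Bigl(x_{\varepsilon,\theta}, \frac{x_{\varepsilon,\theta} - \xi_{\varepsilon,\theta}}{\lambda\varepsilon} + \frac{p}{\lambda}\Bigr).
\]
Similarly, noting $y_{\varepsilon,\theta}$ is a maximum of
\[
y \mapsto -\underline f(y) - \frac{1}{2\varepsilon}\|y - \eta_{\varepsilon,\theta}\|^2 - \varepsilon d_C(y, y_{\varepsilon,\theta}),
\]
by the supersolution property of $\underline f$,
\[
\frac{\underline f - h}{\alpha}(y_{\varepsilon,\theta}) \ge -\frac{1}{\varepsilon}\langle y_{\varepsilon,\theta} - \eta_{\varepsilon,\theta}, C^0\eta_{\varepsilon,\theta}\rangle - \varepsilon
+ \inf_{\|q\| \le \varepsilon} G\Bigl(y_{\varepsilon,\theta}, -\frac{y_{\varepsilon,\theta} - \eta_{\varepsilon,\theta}}{\varepsilon} + q\Bigr).
\]
Therefore
\begin{align*}
\alpha^{-1}(u - v)(x_{\varepsilon,\theta}, y_{\varepsilon,\theta})
&\le \frac{1}{\varepsilon}\bigl\langle (x_{\varepsilon,\theta} - \xi_{\varepsilon,\theta}, y_{\varepsilon,\theta} - \eta_{\varepsilon,\theta}), (C^0\xi_{\varepsilon,\theta}, C^0\eta_{\varepsilon,\theta})\bigr\rangle + 2\varepsilon \\
&\quad + \sup_{\|p\|^2 + \|q\|^2 \le 2\varepsilon^2}\Bigl(\lambda G\Bigl(x_{\varepsilon,\theta}, \frac{x_{\varepsilon,\theta} - \xi_{\varepsilon,\theta}}{\lambda\varepsilon} + \frac{p}{\lambda}\Bigr)
- G\Bigl(y_{\varepsilon,\theta}, -\frac{y_{\varepsilon,\theta} - \eta_{\varepsilon,\theta}}{\varepsilon} + q\Bigr)\Bigr). \tag{5.18}
\end{align*}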


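(3) Apply Lemma A.8 of [8] to (5.16), and take $\theta = \varepsilon^3$. We denote
\[
x_\varepsilon = x_{\varepsilon,\varepsilon^3}, \qquad y_\varepsilon = y_{\varepsilon,\varepsilon^3}, \qquad \xi_\varepsilon = \xi_{\varepsilon,\varepsilon^3}, \qquad \eta_\varepsilon = \eta_{\varepsilon,\varepsilon^3}.
\]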


(a) After some algebra (for details, see page 86 of [8]), 
sp. 














(5.19) lim sup 
£04 


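(b) Take $\theta = \varepsilon^3$, $\xi = S(h)\xi_\varepsilon$, $\eta = S(h)\eta_\varepsilon$ in (5.15); noting $\|C^0 S(h)\xi_\varepsilon\| \le \|C^0\xi_\varepsilon\|$ and $\|C^0 S(h)\eta_\varepsilon\| \le \|C^0\eta_\varepsilon\|$,
\begin{align*}
&\frac{1}{h}\,\frac{1}{2\varepsilon}\bigl(\|(x_\varepsilon, y_\varepsilon) - (\xi_\varepsilon, \eta_\varepsilon)\|^2 - \|(x_\varepsilon, y_\varepsilon) - S(h)(\xi_\varepsilon, \eta_\varepsilon)\|^2\bigr) \\
&\qquad \le \frac{\Phi(S(h)\xi_\varepsilon, S(h)\eta_\varepsilon) - \Phi(\xi_\varepsilon, \eta_\varepsilon)}{h}
+ \varepsilon\,\frac{1}{h}\bigl(d_C(S(h)\xi_\varepsilon, \xi_\varepsilon) + d_C(S(h)\eta_\varepsilon, \eta_\varepsilon)\bigr).
\end{align*}
Send $h \to 0+$; by (3.2) and (3.4),
\[
\Bigl\langle \frac{(x_\varepsilon, y_\varepsilon) - (\xi_\varepsilon, \eta_\varepsilon)}{\varepsilon}, (C^0\xi_\varepsilon, C^0\eta_\varepsilon)\Bigr\rangle
\le \limsup_{h \to 0+} \frac{\Phi(S(h)\xi_\varepsilon, S(h)\eta_\varepsilon) - \Phi(\xi_\varepsilon, \eta_\varepsilon)}{h} + 2\varepsilon.
\]
Hence by (5.17),
\begin{align*}
\limsup_{\varepsilon \to 0+}\Bigl\langle \frac{(x_\varepsilon, y_\varepsilon) - (\xi_\varepsilon, \eta_\varepsilon)}{\varepsilon}, (C^0\xi_\varepsilon, C^0\eta_\varepsilon)\Bigr\rangle
&\le \limsup_{h \to 0+, \varepsilon \to 0+} \frac{\Phi(S(h)\xi_\varepsilon, S(h)\eta_\varepsilon) - \Phi(\xi_\varepsilon, \eta_\varepsilon)}{h} \\
&\le \limsup_{h \to 0+, (x,y) \to (\hat x, \hat y)} \frac{\Phi(S(h)x, S(h)y) - \Phi(x,y)}{h} \\
&\le D_C^+\Phi(\hat x, \hat y) \le D_C^+\Phi_0(\hat x, \hat y) + 2\kappa. \tag{5.20}
\end{align*}
Finally, apply (5.19) and (5.20) to (5.18); noting (5.17),
\[
\frac{u - v}{\alpha}(\hat x, \hat y) \le D_C^+\Phi_0(\hat x, \hat y) + 2\kappa
+ \sup_{\|p\|^2 + \|q\|^2 \le 2(\rho + \kappa)^2} G_\lambda\bigl((\hat x, \hat y); D\phi(\hat x, \hat y) + (p,q)\bigr). \tag{5.21}
\]
Send $\kappa \to 0$;
\[
\frac{u - v}{\alpha}(\hat x, \hat y) \le (H_{2,\lambda}\Phi_0)^*(\hat x, \hat y). \qquad □
\]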
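We discuss a property of $D_C^+$.
LEMMA 5.10. Let $\Phi(x,y) \in D$ be such that
\[
\Phi(x,y) = \phi(x,y) + \theta\bigl(d_C(x,x_0) + d_C(y,y_0)\bigr), \qquad x_0, y_0 \in E,\ \theta > 0,
\]
where
\[
\phi(x,y) = \frac{\mu}{2}\|x-y\|^2 + \frac{\varepsilon}{2}\bigl(\varphi(\|x\|^2) + \varphi(\|y\|^2)\bigr),
\qquad \mu > 0,\ \varphi \in C^1([0,\infty)),\ \varphi' \ge 0.
\]
Then
\[
D_C^+\phi(x,y) \le 0 \tag{5.22}
\]
and
\[
D_C^+\Phi(x,y) \le 2\theta. \tag{5.23}
\]

PROOF. Equation (5.22) follows because $S(t)$ is a contraction semigroup with
the property $\|S(t)x\| \le \|x\|$:
\[
\phi(S(h)x, S(h)y) \le \phi(x,y) \qquad \forall x, y \in E,\ h > 0.
\]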




For each $x, x_0 \in E$, by the definition of $d_C$, there exists $t_0 \ge 0$ such that
\begin{align*}
d_C(x, x_0) + h &= \inf_{t \ge 0}\{t + \|x - S(t)x_0\|\} + h = t_0 + \|x - S(t_0)x_0\| + h \\
&\ge t_0 + h + \|S(h)x - S(t_0 + h)x_0\| \ge d_C(S(h)x, x_0) \qquad \forall h \ge 0.
\end{align*}
Hence (5.23) follows. □
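The shift inequality $d_C(S(h)x, x_0) \le d_C(x, x_0) + h$ used above can be checked numerically in a toy setting. The sketch below is only an illustration under stated assumptions: the semigroup $S(t)x = e^{-t}x$ on $\mathbb{R}^d$ is a hypothetical stand-in for the semigroup generated by $C$, and the infimum over $t \ge 0$ is replaced by a minimum over a finite grid (the names `semigroup` and `tataru_distance` are ours, not the paper's).

```python
import numpy as np

def semigroup(t, x):
    # Hypothetical contraction semigroup on R^d: S(t)x = exp(-t) * x.
    return np.exp(-t) * x

def tataru_distance(x, y, step=1e-3, t_max=20.0):
    # Grid approximation of d_C(x, y) = inf_{t >= 0} { t + ||x - S(t) y|| }.
    # The value at t = 0 is ||x - y||, so for moderate inputs the minimizer
    # lies well inside [0, t_max].
    ts = step * np.arange(int(round(t_max / step)) + 1)
    shifted = np.exp(-ts)[:, None] * y[None, :]   # rows are S(t_k) y
    vals = ts + np.linalg.norm(x[None, :] - shifted, axis=1)
    return float(vals.min())

def shift_inequality_holds(x, x0, h, tol=1e-9):
    # Checks d_C(S(h)x, x0) <= d_C(x, x0) + h up to floating-point tolerance.
    return tataru_distance(semigroup(h, x), x0) <= tataru_distance(x, x0) + h + tol
```

When `h` is a multiple of `step`, the grid version of the inequality holds by the same shift argument as in the proof, so the check does not depend on the quality of the grid approximation.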


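PROOF OF THEOREM 5.2. We assume $(\bar f - \underline f)(z_0) = 4\delta_0 > 0$ for some
$z_0 \in E$ (otherwise, there is nothing to prove). We want to create a contradiction.

Let $1 < \lambda < 1 + \delta_0/(1 + \sup_x |\bar f(x)|)$. We recall that $u(x,y) = \lambda \bar f(x) - \underline f(y)$.
Then $u(z_0, z_0) > 3\delta_0 > 0$. Let $\varphi(r)$ be given by Condition 5.5(2); we define
\[
\phi(x,y) = \frac{\mu}{2}\|x-y\|^2 + \frac{\varepsilon}{2}\bigl(\varphi(\|x\|^2) + \varphi(\|y\|^2)\bigr).
\]
Hence for $0 < \varepsilon < \delta_0/(1 + \varphi(\|z_0\|^2))$,
\[
0 < 2\delta_0 < u(z_0, z_0) - \delta_0 < (u - \phi)(z_0, z_0) \le \sup_{x,y \in E}(u - \phi)(x,y). \tag{5.24}
\]
By Lemma 5.3 on perturbed optimization, for each $\theta > 0$, there exist $x_{\mu,\varepsilon,\theta}$,
$y_{\mu,\varepsilon,\theta} \in E$ such that
\[
(u - \phi)(x,y) - \theta\bigl(d_C(x, x_{\mu,\varepsilon,\theta}) + d_C(y, y_{\mu,\varepsilon,\theta})\bigr) \le (u - \phi)(x_{\mu,\varepsilon,\theta}, y_{\mu,\varepsilon,\theta})
\]
and
\[
\sup_{x,y \in E}(u - \phi)(x,y) < (u - \phi)(x_{\mu,\varepsilon,\theta}, y_{\mu,\varepsilon,\theta}) + \theta. \tag{5.25}
\]
Denote
\[
\Phi(x,y) = \phi(x,y) + \theta\bigl(d_C(x, x_{\mu,\varepsilon,\theta}) + d_C(y, y_{\mu,\varepsilon,\theta})\bigr);
\]
therefore $u(x,y) - \Phi(x,y)$ attains its maximum at $(x_{\mu,\varepsilon,\theta}, y_{\mu,\varepsilon,\theta})$. From (5.24)
and (5.25),
\begin{align*}
0 &< \delta_0 + \frac{\mu}{2}\|x_{\mu,\varepsilon,\theta} - y_{\mu,\varepsilon,\theta}\|^2 + \frac{\varepsilon}{2}\bigl(\varphi(\|x_{\mu,\varepsilon,\theta}\|^2) + \varphi(\|y_{\mu,\varepsilon,\theta}\|^2)\bigr) \\
&\le u(x_{\mu,\varepsilon,\theta}, y_{\mu,\varepsilon,\theta}) \le \lambda \sup_x |\bar f(x)| + \sup_x |\underline f(x)| < \infty \tag{5.26}
\end{align*}
for every $\theta < \delta_0$, $|\lambda - 1| < \delta_0(1 + \sup_x |\bar f(x)|)^{-1}$. Equation (5.26) implies the
existence of constants $C_\mu$, $M_\varepsilon$ such that
\[
\mu\|x_{\mu,\varepsilon,\theta} - y_{\mu,\varepsilon,\theta}\| \le C_\mu < \infty \qquad \forall \varepsilon > 0,\ 0 < \theta < \delta_0,
\]
\[
\|x_{\mu,\varepsilon,\theta}\|^2 + \|y_{\mu,\varepsilon,\theta}\|^2 \le M_\varepsilon < \infty \qquad \forall \mu > 0,\ 0 < \theta < \delta_0.
\]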


In addition, since 


Co = sup re (r^)] + Ire (r)| < oo, 
r>0 




there exists constant N;,,,, 
Ixu oll + iyus oll + Mlxu,ee — Yu.sol 
+ eg" (lku, e, lI xe, + 8€ Ile. o I MY e,o ll < Neu < oo, 
for 0 < 0 < do. Since 
DE (X, e,0. Yu,c,0) 
= (w(Xp,6,6 — Yu,e,0) + 9 (lXu,e,0 lI") Xp,6,0> 
— W(X p,6,0 — Yu,e,0) + EP' (ll¥u,c,6 ll") ¥u.0,8): 


and u is a viscosity subsolution of (J — aH>,,)u = v (Lemma 5.9), by the es- 
timate in (5.23) and (5.26), for 0 < àg and |A — 1| x do(1 + 2sup, |h(x)] V 
2sup, |f (D, 


ô 
2 — (Russe) — hys.) 


P ae 
< U(Xp,e,0. Ypu,e,6) e Oh (Xp,6,0) = h(yp,¢,6)) 


< 20 + sup Gy ((%p,6,65 Yu,e,0)3 
lp i7-+Il¢I?7<26? 


Dó(X,,e,0; Yu,e,) + Cp, q)) 


(5.27) <20 + sup Gl(xu, ese; Yu,e,0): 
lp1?-- lg 1? «20? 


((%p,6,0 — Yu,e,0) 
+ eg (llxy, e ol xps, + P, 
— A Xy,e,0 — Yyu,e,0) 
t eg (Ipso ll u.s +9)}- 


We select Ao, A1 satisfying 1 < Ag < Ay <A, and let them be fixed. By Condi- 
tion 5.5, for || pli < 1, 


B(Xy,s,9 — Yu,s0) + AETR lois + P) 
À 


M(Xp,6,0 — Yu,e0) + EQ’ (lixue, Il xus, EH 3 
À 


L(Xp,e,6 = Yu,e,0) = EQ’ (IX, e, o [nad 
À 


MX, e,0 did Yp,e,0) T EQ" [Xy e,6 east) 
À 


AG (xus $ 
= AG (seus $ 
T AG (xus 3 


+ AG (sues 5 




(x , ,0 d ,8 X — 
= 1G (x00. Peed eet) +A1G (zns Beo — Diei) €^ we 
l 


ipil € 
Aun X ) nn. C,/A1 (=) 


L(Xy,6,6 is mee) 


AiG : 
TA (nss X 


Similarly, for ||qg|| < 1 and 0 € & <1, 


G(Yy,e, 0. A Xp,e,0 — Yue) — £0 (MYu e0 I") Yus — q) 
= G(yy,2,05 L(Xy,e,6 = Yu,e,0) m Ep (l¥u,6, 7) yu,6,) eS ON, +1 (lg ID 


L(Xp,2,0 ai: en) 


Xo — Vag, Cyt+Co(&) — On, +1 lla 1D. 


= AoG (ons ? 
Therefore 


G4 (Xy, s,6. Yp,e,0)3 (I pe, = Yu,e,0) T £Q' (lxy, s, lx, eso +p, 
ES L(Xp,6,6 T5 Yu,e,0) 2 £9 (Ip, e, o I, S q)} 


M(Xy,e,8 — Yu,e,9) M(Xy,e,0 — Yu,e,0) 
- 1G (žm E) B MG (yus, pened ee 


lpi i 
+ AGN, 41 + ALYA/A4,Cu/A1 P 


T Vao,CutCo n T ON, u+! (Il¢ I) 


u lpi 
< À0011/Ao (Irma — Yu,e,ell + Ac Teee ~ Xueel ?) MAS "s À ) 


Ca ce =) dioses) Pons a 


MIN. SRM 
We rewrite (5.27) next. For 0 «< ôo < land1<A<1+ IW 


ó 
5 — (A(Xp,0,0) — hys?) 


Il pl 


^ ) a ow, il) 


< 20 + sup (row, ati (= 
pl? +11¢ 1? «20? 


(5.28) 


€ 
ELO 


PAL CX: (= 


7 
T dO Pr /Ao (Irme — Yu,£,0 | + Ag ene — Yu,e,0 P). 


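Let
\[
m_\mu = \sup_{x,y \in E}\Bigl(u(x,y) - \frac{\mu}{2}\|x-y\|^2\Bigr).
\]
From (5.25),
\begin{align*}
u(x,y) - \frac{\mu}{2}\|x-y\|^2
&\le \liminf_{\varepsilon \to 0, \theta \to 0}\Bigl(u(x_{\mu,\varepsilon,\theta}, y_{\mu,\varepsilon,\theta}) - \frac{\mu}{2}\|x_{\mu,\varepsilon,\theta} - y_{\mu,\varepsilon,\theta}\|^2\Bigr) \\
&\le \limsup_{\varepsilon \to 0, \theta \to 0}\Bigl(u(x_{\mu,\varepsilon,\theta}, y_{\mu,\varepsilon,\theta}) - \frac{\mu}{2}\|x_{\mu,\varepsilon,\theta} - y_{\mu,\varepsilon,\theta}\|^2\Bigr) \\
&\le m_\mu, \qquad x, y \in E.
\end{align*}
Hence
\[
m_\mu = \lim_{\varepsilon \to 0, \theta \to 0}\Bigl(u(x_{\mu,\varepsilon,\theta}, y_{\mu,\varepsilon,\theta}) - \frac{\mu}{2}\|x_{\mu,\varepsilon,\theta} - y_{\mu,\varepsilon,\theta}\|^2\Bigr).
\]
Applying Lemma 3.2 in [8],
\[
\lim_{\mu \to \infty} \limsup_{\varepsilon \to 0, \theta \to 0} \mu\|x_{\mu,\varepsilon,\theta} - y_{\mu,\varepsilon,\theta}\|^2 = 0.
\]
In (5.28), let $\theta \downarrow 0$, then $\varepsilon \downarrow 0$, then $\mu \uparrow +\infty$; we obtain
\[
0 < \delta_0/2 \le 0.
\]
A contradiction. □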


APPENDIX 


A.1. Verifying semigroup generation condition. 


A.1.1. The Cahn-Hilliard equation. Let $\omega = \sup_r |V''(r)|^2/4$ and let $C$, $C_n$ be
defined according to (4.4) and (1.23). We prove the following.

LEMMA A.1. The closure of $C$ (resp. $C_n$) is an m-dissipative operator
in $L^2(\mathcal{O})$.

The proof is divided into two parts.
LEMMA A.2. Both $C$ and $C_n$ are dissipative.

PROOF. Let $x, y \in D(C)$; then
\begin{align*}
\langle Cx - Cy, x - y\rangle
&= \langle \Delta(-\Delta(x - y)), x - y\rangle + \langle V'(x) - V'(y), \Delta(x - y)\rangle - \omega\|x - y\|^2 \\
&\le -\|\Delta(x - y)\|^2 + \sup_r |V''(r)|\,\|x - y\|\,\|\Delta(x - y)\| - \omega\|x - y\|^2 \\
&\le \Bigl(\frac{\sup_r |V''(r)|^2}{4} - \omega\Bigr)\|x - y\|^2 \le 0.
\end{align*}
The case of $C_n$ can be treated similarly. □
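The computation in Lemma A.2 can be reproduced in a finite-dimensional model. The following is a minimal sketch, assuming a symmetric periodic second-difference matrix in place of the Laplacian and the illustrative choice $V'(r) = \sin r$ (so that $\sup_r |V''(r)| = 1$ and $\omega = 1/4$); neither choice comes from the paper, and the function names are hypothetical.

```python
import numpy as np

def periodic_laplacian(n):
    # Symmetric second-difference matrix on a periodic grid with unit spacing;
    # a hypothetical stand-in for the Laplacian on the domain.
    lap = -2.0 * np.eye(n)
    for i in range(n):
        lap[i, (i + 1) % n] += 1.0
        lap[i, (i - 1) % n] += 1.0
    return lap

def cahn_hilliard_operator(x, lap, omega=0.25):
    # C x = Lap(-Lap x + V'(x)) - omega * x, with the assumed V'(r) = sin(r).
    return lap @ (-lap @ x + np.sin(x)) - omega * x

def dissipativity_gap(x, y, lap, omega=0.25):
    # <Cx - Cy, x - y>; the lemma's estimate predicts a nonpositive value
    # when omega = sup|V''|^2 / 4.
    diff = cahn_hilliard_operator(x, lap, omega) - cahn_hilliard_operator(y, lap, omega)
    return float(diff @ (x - y))
```

Because the matrix is symmetric and $\sin$ is 1-Lipschitz, the three-line estimate in the proof carries over verbatim, and the gap equals at most $-(\|\Delta d\| - \tfrac12\|d\|)^2$ with $d = x - y$.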


LEMMA A.3. Let $0 < \alpha \le \alpha_0$ where $\alpha_0$ is some prefixed small number. Suppose
$y_0 \in L^2(\mathcal{O})$. Then for each $y_n = P_n y_0$, there exists $x_n \in \mathcal{R}(P_n)$ such that
\[
(I - \alpha C_n)x_n = y_n. \tag{A.1}
\]
Moreover,
\[
\|x_n\| \le \|y_n\| \quad \text{and} \quad \|C_n x_n\| \le (2/\alpha)\|y_n\|, \tag{A.2}
\]
\[
\|\Delta x_n\| \le c(1 + \|y_n\|), \tag{A.3}
\]
where $c$ is a constant depending on $F$ and $\alpha$. Consequently, there exists $x_0 \in D(\bar C)$
such that
\[
(I - \alpha \bar C)x_0 = y_0.
\]

PROOF. For each $n$ fixed and finite, by its definition (1.23), $C_n$ is Lipschitz on
Range$(P_n)$. By the fixed point theorem, the essentially finite-dimensional equation (A.1)
has a solution when $\alpha > 0$ is sufficiently small. Since $C_n$ is dissipative
on Range$(P_n)$, a solution actually exists for all $\alpha > 0$. See Lemma 2.13 of [26].

$\|x_n\| \le \|y_n\|$ follows from the dissipativity of $C_n$. It follows then that
$\alpha\|C_n x_n\| = \|x_n - y_n\| \le 2\|y_n\|$. Next, $\langle x_n - \alpha C_n x_n, x_n\rangle = \langle y_n, x_n\rangle$. That is,
\[
\|x_n\|^2 - \alpha\bigl(-\|\Delta x_n\|^2 + \langle V'(x_n), \Delta x_n\rangle - \omega\|x_n\|^2\bigr) = \langle y_n, x_n\rangle,
\]
which implies
\[
\alpha\bigl(\|\Delta x_n\|^2 + \omega\|x_n\|^2\bigr) \le \|y_n\|\,\|x_n\| - \|x_n\|^2 + \alpha\|V'(x_n)\|\,\|\Delta x_n\|.
\]
Since $V'(r)$ grows at most linearly, we can find a constant $c > 0$ such that
\[
\|V'(x_n)\| \le c(1 + \|x_n\|).
\]
Hence
\begin{align*}
\alpha\|\Delta x_n\|^2 &\le \|y_n\|\,\|x_n\| + \alpha c(1 + \|x_n\|)\|\Delta x_n\| \\
&\le \|y_n\|^2 + \alpha c(1 + \|y_n\|)\|\Delta x_n\|,
\end{align*}
implying (A.3).
For each $x \in H^2(\mathcal{O})$, $\|\Delta x\| < \infty$ and
\[
\Delta V'(x) = V''(x)\Delta x + V'''(x)\nabla x \cdot \nabla x. \tag{A.4}
\]
We note that the Sobolev embedding $H^1(\mathcal{O}) \to L^4(\mathcal{O})$ holds for space dimensions
$d = 1, 2, 3$. Such a result can be found in [1]: the case of $d = 3$ follows from
Lemma 5.10, the case of $d = 2$ from Corollary 5.13 and the case of $d = 1$ follows
from Corollary 5.16 of [1]. Therefore,
\begin{align*}
\|\Delta V'(x_n)\| &\le \sup_r |V''(r)|\,\|\Delta x_n\|_{L^2(\mathcal{O})} + \sup_r |V'''(r)|\,\|\nabla x_n\|^2_{L^4(\mathcal{O})} \\
&\le C\bigl(1 + \|\Delta x_n\|^2_{L^2(\mathcal{O})}\bigr)
\end{align*}
for some constant $C$ independent of the $x_n$'s. By (A.3), $\sup_n \|\Delta V'(x_n)\| < \infty$.
Using this estimate and $\sup_n \|C_n x_n\| < \infty$, we obtain $\sup_n \|\Delta^2 x_n\| < \infty$.

The boundedness of $\sup_n(\|\Delta x_n\| + \|\Delta^2 x_n\|)$ implies that $\Delta x_n$ is relatively
compact in $L^2(\mathcal{O})$. Similarly, the boundedness of $\|\Delta x_n\|$ and $\|x_n\|$ implies the
relative compactness of $\nabla x_n$ and $x_n$. Selecting a subsequence if necessary, we
have $x_n \to x_0$, $\Delta x_n \to \Delta x_0$ and $\nabla x_n \to \nabla x_0$ for some $x_0 \in H^2(\mathcal{O})$. By (A.4),
$\Delta V'(x_n) \to \Delta V'(x_0)$. Therefore
\[
\|C_n x_n - C x_n\| = \|(P_n - I)\Delta V'(x_n)\| \to 0.
\]
Noting
\[
\alpha C x_n = \alpha C_n x_n + \alpha(C x_n - C_n x_n) = x_n - y_n + \alpha(C x_n - C_n x_n) \to x_0 - y_0,
\]
$(x_0, \alpha^{-1}(x_0 - y_0)) \in \bar C$. By (A.1),
\[
y_0 = x_0 - \alpha\,\frac{x_0 - y_0}{\alpha} \in (I - \alpha \bar C)x_0. \qquad □
\]
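The first step of the proof (solving the resolvent equation (A.1) by a fixed point argument for small $\alpha$, then reading off the bounds (A.2) from dissipativity) can be illustrated in a finite-dimensional model. This is a sketch under assumptions that are not in the paper: the Laplacian is replaced by a periodic second-difference matrix, $V'(r) = \sin r$, $\omega = 1/4$, and $\alpha = 0.02$ is chosen so that $\alpha$ times the Lipschitz constant of the discrete operator (at most $16 + 4 + 1/4$) is below one.

```python
import numpy as np

def periodic_laplacian(n):
    # Symmetric second-difference matrix on a periodic grid (assumed model).
    lap = -2.0 * np.eye(n)
    for i in range(n):
        lap[i, (i + 1) % n] += 1.0
        lap[i, (i - 1) % n] += 1.0
    return lap

def operator_c(x, lap, omega=0.25):
    # C x = Lap(-Lap x + sin(x)) - omega * x; note C(0) = 0.
    return lap @ (-lap @ x + np.sin(x)) - omega * x

def solve_resolvent(y, lap, alpha, iterations=500):
    # Fixed-point iteration x <- y + alpha * C(x) for (I - alpha C) x = y.
    # It contracts when alpha * Lip(C) < 1, which holds for alpha = 0.02 here.
    x = y.copy()
    for _ in range(iterations):
        x = y + alpha * operator_c(x, lap)
    return x
```

For the computed solution one expects $\|x\| \le \|y\|$ and $\alpha\|Cx\| = \|x - y\| \le 2\|y\|$, mirroring (A.2).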


A.1.2. The quasilinear equation with viscosity. We consider the $C_n$, $C$ in
(1.27) and (4.5). Using a priori estimate arguments similar to those in the Cahn-Hilliard
equation case, we can prove the following:

LEMMA A.4. $C_n$ and the closure of $C$ are both m-dissipative operators
in $L^2(\mathcal{O})$.

A.1.3. The Allen-Cahn equation. Let $C_n$ and $C$ be defined according to (1.18)
and (4.1). Again, using a priori estimate arguments similar to the Cahn-Hilliard
case, we can prove that both $C_n$ and $C$ are m-dissipative operators in $L^2(\mathcal{O})$.
Alternatively, this conclusion can also be established by invoking the classical
perturbation theory in, for instance, Corollary 6.19() of [26].


A.2. Exponential compact containment estimates. We illustrate the use of
a stochastic Lyapunov function technique to verify Condition 1.12. We consider
Examples 1.2, 1.5 and 1.8.




A.2.1. Stochastic Allen-Cahn equation. Recall (1.17) in Example 1.2: 


(A.5) dX,(t) = AP, Xn(t) dt — Pa V’ (Pa Xn) dt + Bal Xnlt) dW (t). 


The goal of this subsection is to prove the following lemma. 
LEMMA A.5. Condition 1.12 holds for Xn. 


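We introduce the free energy function
\[
\mathcal{E}(x) = \frac12\|\nabla x\|^2 + \int_{\mathcal{O}} V(x)\,d\theta.
\]
First, we prove the following estimate: for every $T, a > 0$ and $C_0 > 0$, there exists
$C_1 > 0$ such that
\[
\sup_{x:\,\mathcal{E}(x) \le C_0} P\bigl(\mathcal{E}(X_n(t)) > C_1, \text{ some } 0 \le t \le T \mid X_n(0) = x\bigr) \le e^{-na}. \tag{A.6}
\]
Let us approximate $\mathcal{E}$ by
\[
\mathcal{E}_n(x) = -\frac12\langle \Delta P_n x, x\rangle + \int_{\mathcal{O}} V(P_n x(\theta))\,d\theta. \tag{A.7}
\]
Note that if $X_n(0) \in \mathcal{R}(P_n)$, the range of $P_n$, then $X_n(t) \in \mathcal{R}(P_n)$, hence
$\mathcal{E}_n(X_n(t)) = \mathcal{E}(X_n(t))$. Define
\[
f_n(x) = \log\Bigl(1 + \frac{1}{M^2}\mathcal{E}_n(x)\Bigr), \tag{A.8}
\]
where $M = \sup_{\theta,x} |\sigma(\theta,x)| < \infty$. Then
\[
Df_n(x) = \frac{(-1/M^2)\bigl(\Delta P_n x - P_n V'(P_n x)\bigr)}{1 + (1/M^2)\mathcal{E}_n(x)}
\]
and
\begin{align*}
D^2 f_n(x) &= -\frac{(1/M^2)\bigl(\Delta P_n x - P_n V'(P_n x)\bigr) \otimes (1/M^2)\bigl(\Delta P_n x - P_n V'(P_n x)\bigr)}{\bigl(1 + (1/M^2)\mathcal{E}_n(x)\bigr)^2} \\
&\quad + \frac{(1/M^2)\bigl(-\Delta P_n + P_n V''(P_n x)\bigr)}{1 + (1/M^2)\mathcal{E}_n(x)},
\end{align*}
where $P_n V''(P_n x)$ means a linear operator on $L^2(\mathcal{O})$ [for each $x \in L^2(\mathcal{O})$ fixed]:
\[
(P_n V''(P_n x))y = \sum_{k=(k_1,\ldots,k_d)=(1,\ldots,1)}^{(m_n,\ldots,m_n)} \langle V''(P_n x)y, e_k\rangle e_k \qquad \forall y \in L^2(\mathcal{O}).
\]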



Then 
H, fa (x) = (AP, x — P, V'(P,x), D falx) 
1 1 
+ 518; 62 Dfa GO + zr Tr(D? fi (x) Br (x) Br ()) 
p lA Pax — P, E e 
1 + (1/M2)6, (x) 


1 ( B D) (1/M^) B; (A Pax — Pr Vy (Pax) 


t3 1+ (1/426, (x) 





| 
(Mns. mg) | 
*l x (C AP, PV (S3) 7 1 a ^a) 








2n k=(1,...,1) 


x (1 + aeo) 


_ 1/M*)|A Pax — Pa V' (Pax)? 
i 1+ (0/M?)6, (x) 
«zn --\| ee 
2\ n 1+ (0/M?)6, (x) 


] x07 c OB de ide ma A 1) Ai ({(Bu(x)/M Der, ej) 


gereg 


2n 1+ (1/M2)&, (x) 


i: 1 (mg, ..., ma) (ng, mg) Mys., No x) 
2n M? 








(A.9) 


ck. €;)(V"(Pax)ej e 


s] 


kallas RI pP EI PL 





x (1 + eo) 


| ] 1 1 ] á 
dea * a 
1+(1/M)E,(x) 2 n/\1+ (0/M?)6&s (x) 

| A Pax — Pra V’ (Pax)? 

a) om 

(mg ,..., mg) 
m Y mi f, IV"(P,x(0))| d0 

i1... 1) 


4d 24 (1 + m, d 3d 
<O+ RR + -A- sup |V"(r)| < Constant < oo, 
r 


g 
2n 




where the A;'s are eigenvalues defined in (1.12) and the constant is independent 
of n. In the last inequality above, we used (1.14), and the estimate that 


(mg, ..., mg) Mn d 3d 
1 
(A.10) >» AS ( X ua) sad M E 
i—(1,...,1) 


i11 
where 44; is the one in (1.10). 
Let
\[
\tau_n = \inf\{t > 0 : \mathcal{E}_n(X_n(t)) > C_1\}.
\]
By the optional sampling theorem,


sup P(&,(X,(t)) > C1, some 0 <t x T|X,(0) =x) 
x: 8(x)<Co 
x e" € —Co)-nT sup, x An f(x) 


(A.11) 
< sup Efe OG (T^) -nfa Qt (0))— f ^ nBn fa Qn) ds | yr. (0) = x] 
T x: 6(x)x Co 


ec. 
Hence (A.6) follows. We now relax the initial condition in the estimate to that in 
Lemma A.5/Condition 1.12. We achieve this by the following result: 


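LEMMA A.6. We denote by $X_n^x$ the solution of (A.5) with initial value
$X_n^x(0) = x$. Then for each $T, a > 0$, there exists a constant $C = C(T,a) > 0$ such
that
\[
P\Bigl(\sup_{0 \le t \le T}\|X_n^x(t) - X_n^y(t)\| > C\varepsilon \Bigm| \|x - y\| < \varepsilon\Bigr) \le e^{-na}
\qquad \forall\, 0 < \varepsilon < 1,\ n = 1, 2, \ldots.
\]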


PROOF. For each n fixed, (X7(¢), X ?(t))isa two-component Markov process 
that solves the martingale problem with generator 
An f (x, y) = (A Pax — Pa V' (Pax), Dx f (x, y)) 


+ (A Pay — Pa V' (Pay), Dy f (x, y)) 
+ 2 Tr((D2, f) B, GO) BEQO 
2n 
+ (D2, f) B.) Br) + AD? f) Bu (0) B y) 


for f(x, y) e C?(L?(0) x L?(0)). Let e > 0 and 
) 


Pax — Pay 
E 














1 
Ín,cQX, y= log(1 4- ; 




It follows that 
1 
Hn fa e (x, y) = me Mine Ane (x, y) € Co < oo, 


where the constant $C_0$ is independent of $n$ as well as $\varepsilon$. By an argument identical to
that used in the proof of (A.11), the conclusion follows. □


PROOF OF LEMMA A.5. Let compact set K C L*(Q) and a, T, £ > 0. It is 
enough for us to show that for any x, € K, there exists compact set Ki C E, 


P(3t e[0, T], X7" (t) € Kj eun, 
Let ô > 0. By compactness of K, there exists [^T Viu Xn mo such that 


m(6) 


K c | BG, ô). 
k=] 


Since {x : €(x) < +00} C H! (O) is dense in L^(0), we can select xj so that 


sup & (xd) < Foo. 
ô) 


Therefore, we can choose 


efx, xO 


X0,n € 1X9. 5 X9 


such that |[xo,» — Xn || < 6. By Lemma A.6, there exists C = C(T, a) > 0 (indepen- 
dent of 8) such that 


P( sup [|Xn°"(t) — X» (t)|| > ci) «gd 


O<1<T 


By (A.6) and the compactness of level sets for €, there exists a compact set Kı C E 
such that 


P(X” (t) € KÈ, 3t € [0, T]) xe "^. 
It follows that 
[3t e [0, T], X^ @) € KETO) 
C (3t € [0, T], Xn") e K))U om, | X5? (t) — X? O > ci]. 
Therefore 
P(3t e[0, T], X^ @) e KOTO) 
< P(3te[0, T], X" (t) ¢ KÎ) + P( sup. |Xz^^ (t) — X; (0 > ca) 


«2e 4. 


Taking à = e/(1 + C), we complete the proof. L 




A.2.2. Stochastic Cahn-Hilliard equation. Recall the stochastic Cahn- 
Hilliard equation (1.22) in Example 1.5: 


(4.12) dX5(t) = AP. (—APp Xn (t) + Pa V (P, Xp (t))) dt + Ah dW (t). 


We prove the following lemma. 
LEMMA A.7. Condition 1.12 holds for $X_n$.
Using arguments identical to those in the Allen-Cahn case, we just need the following
estimates: $\sup_n \sup_x H_n f_n(x) < \infty$ for the $f_n$ below, and (A.15).
Define $\mathcal{E}$, $\mathcal{E}_n$ the same way as in (A.7). Let
\[
f_n(x) = \log\Bigl(1 + \frac{1}{M^2}\mathcal{E}_n(x)\Bigr),
\]
where $M > 0$ is the constant in the Poincaré type inequality


(A.13) 








x—- l) x(0) dé 
0c0 
Let A, be defined according to (1.12). Then 
Hn fa (x) = (A Ph (—A Pax + Pa V'(Pnx)), Dfn(x)) 





| «M|iVxi — Vxe H?(0). 


] 2 I 2 
+ Dfa? + = TED? f Qo) 


|. (-1/MP)[C- AP.) (A Pax — Pa V’ (Pax)? 


1 + (1/M?)&, (x) 
m ( zd lA Pax — Pa V’ (Pax) |] / M" 
2 n (1 + (1/M2)&, (x)? 
_ 1am?) ye) (— A Py + Pa V” (Pax))ek, ex) 
2n 1+ (1/M?)&, (x) 
_ (=1/M?)|| V(A Pax — Pa V’ (Pax)? 
7 1 + (1/M?)&, (x) 
n. ( zr ) (/M?)IA Pax — Pa V'(Pax)l[?/M? 
2\ n (1 + (1/M?) En (x)? 
d 1 Cea) Ak + ee) Quer er) 
M? 2n 1 + (1/M2)&, (x) 


A.14 < r-i) (rmm) | 
(A.14) * za 2\ nJ MN 4+ G/M) En) 




I V(A Pax — P. V’ (Pax)? 
x a) MEN 


1/1 a 
+5 (sa hY ta) 
1 1 (Mn, ..., my) " 
25 MÀ D (m + suplV ol) 


kzx(1,..., 1) 


+ 


« C « oo. 
In the above derivations, we used (A.13): 


lA Pax — P, V (Pao 


2 
E | f, (A Pox (8) — P. V'(Pax(0))) do} 


2 
+ [Ans == P,V'(P,x) m J (556) — P, V'(P,x(0))) do 








2 i 
< (f V'(P,x(0)) a6) + M?|V(AP,x — P,V"(Pnx)) e 
9 
We also made use of (A.10) and condition (1.21). 
LEMMA A.8. We denote by $X_n^x$ the solution of (A.12) with initial value
$X_n^x(0) = x$. Then for each $T, a > 0$, there exists a constant $C = C(T,a) > 0$ such
that
\[
P\Bigl(\sup_{0 \le t \le T}\|X_n^x(t) - X_n^y(t)\| > C\varepsilon \Bigm| \|x - y\| < \varepsilon\Bigr) \le e^{-na}
\qquad \forall\, 0 < \varepsilon < 1,\ n = 1, 2, \ldots. \tag{A.15}
\]

PROOF. The proof follows the same idea as in Lemma A.6. □


A.2.3. Stochastic quasilinear equation with viscosity. Using the same ideas as 
in the stochastic Allen-Cahn and Cahn-Hilliard case, by choosing 


E(x) ixl? + IVx), — 800 = EC Paxil? — (A Pax, x) 
and 
fn (x) = log(1 + 6, (x)), 


we can prove the following. 


LEMMA A.9. Condition 1.12 holds for the Xn in (1.26). 




A.3. Approximations of the Tataru distance function. Let $E$, $U_0$ be real
separable Hilbert spaces. We discuss approximations of the Tataru distance function
$d_C$ (Definition 1.13) by $C^2(E)$ functions. Throughout this section, we assume
Condition 1.11 is satisfied for $C$, $C_n$.

We want to keep two useful properties (3.3) and (3.4) in the approximation.
The functions $h_{\varepsilon,y}(x)$ and $h_{n,\varepsilon,y}(x)$ defined in (3.8) and (3.9) satisfy these
requirements; see (A.18) and (A.20).


LEMMA A.10. For each £ > 0 small enough, define ġ according to (3.7): 
e(r) — Ar when r > £ and 


| (rey 
LEO 8E /6 





when 0 <r «€ €. 





e(r) = 





Then: 


(1) 3,6; € Ch (I0, +00)); de is E, sup, rlo; (r)| < +00. 
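(2) $\lim_{\varepsilon \to 0} \sup_{r \ge 0} |\phi_\varepsilon(r) - \sqrt{r}| = 0$.
(3)
\[
0 \le r\phi_\varepsilon'(r^2) \le 1/2. \tag{A.16}
\]
PROOF. The first two properties follow from direct verification. To see that the
third one holds, let $f(r) = r\phi_\varepsilon'(r^2)$; then $f'(r) \ge 0$ for $0 < r < \varepsilon$, and $f'(r) = 0$
when $r > \varepsilon$. Hence $r\phi_\varepsilon'(r^2) \le \varepsilon\phi_\varepsilon'(\varepsilon^2) = 1/2$. □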


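LEMMA A.11. Let $\varphi, \varphi_n : [0, \infty) \to [0, \infty)$ be continuous. Suppose there exist
$0 < m < M < +\infty$, $0 \le c < \infty$ such that $mt \le \varphi_n(t) \le c + Mt$, $n = 1, 2, \ldots$.
Suppose further that $0 < a_n \to +\infty$ and that
\[
\lim_{n \to \infty} \sup_{0 \le t \le T} |\varphi_n(t) - \varphi(t)| = 0 \qquad \forall T \ge 0.
\]
Then
\[
\lim_{n \to \infty} -\frac{1}{a_n} \log \int_0^\infty e^{-a_n \varphi_n(t)}\,dt = \inf_{t \ge 0} \varphi(t).
\]

PROOF. It is straightforward to see that, when $T > 0$ is large enough but fixed,
\[
\lim_{n \to \infty} -\frac{1}{a_n} \log \int_0^T e^{-a_n \varphi_n(t)}\,dt = \inf_{0 \le t \le T} \varphi(t) = \inf_{t \ge 0} \varphi(t).
\]
Take $T > \varphi(0)/m$; then when $n$ is large enough
\[
-\frac{1}{a_n} \log \int_T^\infty e^{-a_n \varphi_n(t)}\,dt \ge -\frac{1}{a_n} \log \int_T^\infty e^{-a_n m t}\,dt \ge mT > \varphi(0) \ge \inf_{t \ge 0} \varphi(t).
\]
Therefore,
\begin{align*}
\inf_{t \ge 0} \varphi(t)
&= \min\Bigl\{\liminf_{n \to \infty} -\frac{1}{a_n} \log \int_0^T e^{-a_n \varphi_n(t)}\,dt,\ \liminf_{n \to \infty} -\frac{1}{a_n} \log \int_T^\infty e^{-a_n \varphi_n(t)}\,dt\Bigr\} \\
&\le \liminf_{n \to \infty} -\frac{1}{a_n} \log \int_0^\infty e^{-a_n \varphi_n(t)}\,dt
\le \limsup_{n \to \infty} -\frac{1}{a_n} \log \int_0^\infty e^{-a_n \varphi_n(t)}\,dt \\
&\le \limsup_{n \to \infty} -\frac{1}{a_n} \log \int_0^T e^{-a_n \varphi_n(t)}\,dt = \inf_{t \ge 0} \varphi(t). \qquad □
\end{align*}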
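The Laplace-type limit in Lemma A.11 can be observed numerically. The sketch below uses illustrative choices that are not from the paper: $\varphi(t) = t/2 + |t - 1|$ (so $\inf_{t \ge 0}\varphi = 1/2$, with $m = 1/2$, $M = 3/2$), the perturbation $\varphi_n(t) = \varphi(t) + n^{-1}t/(1+t)$, and $a_n = n$. The integral over $[0, \infty)$ is truncated to $[0, 10]$, where the neglected tail is exponentially small, and evaluated by the trapezoid rule after subtracting the largest exponent for numerical stability.

```python
import numpy as np

def phi(t):
    # Assumed limit function; its infimum over t >= 0 is 0.5, attained at t = 1.
    return 0.5 * t + np.abs(t - 1.0)

def phi_n(t, n):
    # Assumed perturbation, converging to phi uniformly on [0, infinity).
    return phi(t) + (t / (1.0 + t)) / n

def scaled_log_integral(n, t_max=10.0, num=200001):
    # Approximates -(1/a_n) * log( integral_0^infinity exp(-a_n * phi_n(t)) dt )
    # with the illustrative choice a_n = n, truncating the integral at t_max.
    a_n = float(n)
    t = np.linspace(0.0, t_max, num)
    expo = -a_n * phi_n(t, n)
    shift = expo.max()                      # log-sum-exp style stabilization
    y = np.exp(expo - shift)
    integral = np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(t))
    return -(shift + np.log(integral)) / a_n
```

Evaluating `scaled_log_integral` for increasing `n` shows the values approaching $1/2$ from above, with an error of order $\log(a_n)/a_n$.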


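LEMMA A.12. Let $a_n > 0$ be such that $\lim_{n \to \infty} a_n = \infty$. Define $\phi_\varepsilon$ as in (3.7)
and $h_\varepsilon$, $h_{n,\varepsilon}$ according to (3.8) and (3.9):
\[
h_{n,\varepsilon}(x) = -\frac{1}{a_n} \log \int_0^\infty e^{-a_n(t + \phi_\varepsilon(\|x - S_n(t)y\|^2))}\,dt
\]
and
\[
h_\varepsilon(x) = \inf_{t \ge 0}\bigl\{t + \phi_\varepsilon(\|x - S(t)y\|^2)\bigr\}.
\]
Then:

(1) $h_{n,\varepsilon}(x) > c$ whenever $\|x\| > \|y\| + c$, for every $c > 1$, $a_n > 1$.
(2) For each $y \in E$ fixed,
\[
\lim_{\varepsilon \to 0+} \sup_{x \in E} |h_\varepsilon(x) - d_C(x,y)| = 0. \tag{A.17}
\]
(3) For each $\varepsilon > 0$ fixed,
\[
\lim_{n \to \infty} \sup_{x \in K} |h_\varepsilon(x) - h_{n,\varepsilon}(x)| = 0 \qquad \forall \text{ compact } K \subset E.
\]
(4) $h_{n,\varepsilon} \in C^2(E)$;
\[
\|Dh_{n,\varepsilon}(x)\| \le 1. \tag{A.18}
\]
If $B_n$ is a Hilbert-Schmidt operator from $U_0$ to $E$ [i.e., $B_n \in L_2(U_0, E)$], then
\[
|\mathrm{Tr}[D^2 h_{n,\varepsilon}(x) B_n B_n^*]|
\le \Bigl(2a_n + 4\sup_{r \ge 0} r|\phi_\varepsilon''(r)| + 2\sup_{r \ge 0} \phi_\varepsilon'(r)\Bigr)\|B_n\|^2_{L_2(U_0,E)}. \tag{A.19}
\]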


(5) Leto ET; we have 


(A.20) limsup Ui, e (Sn (7)2) 7 Pn, 2) < sup y’(r) Vx ec E. 
r>0 


r->0+,z—>x r 


If B, € L2(Uo, E), then 
| TrLD* (9 o hn,e) (x) Bn B*]| 


(A.21) <{supe” (s) + 2 “(s) (22, + e (r)| + pee A e) | 


X [Bs li L2(Uo, E) 


PROOF. Part (1): Since ||S,(¢)yll < lyi, €llxll — lly) v 0 < lx — Ss(Oyll. 
Hence 


$e (lx ]l — liy ll) v 0)^) < has (x) 


when a, > 1. Noting ¢, (r2) =r when r > 1, the conclusion follows. 
Part (2) is a direct consequence of part (2) of Lemma A.10. 
Part (3) follows if we prove that for each x, — x, 


im, An, in) = he Q9. 


Take gy (t) = (t + ós(llxs — AYI e = (t + pelle — S(y12)). For each 
T > 0, 


sup Pn) POl = sup [de (llan — S. (yl?) — pell — Syl?) 
O<1<T O<t<T 


<sup¢i(r) sup |an — S, (yl — lix — S(Oyll| > 0 
r>0 0<1<T 
as n — OO. 
In addition 
tx ot) xt sup de ((lxn ll + Hyh’). 


Apply Lemma A.11; therefore 


lim —— Sig I g nml) dt = inf Fo). 


n-—oo an 
Part (4): It can be verified that 


yon € ^ri SHON (Ix — Su OI?) — Suny) dt 


c E. 
fee e dnlt-$sCIlx—5, (012) dt 


Dhn, (x) = 


By Lemma A.10, 
IDA œ) < 1. 


ei 




Direct calculation also gives 


1 


3 LLL i LLL 
D^hg (x) = i g nlt és CIlx—55()»12)) dt 


x f. g dnte lx S. Oy) 
0 


x (4an) (6; (Ix — Sn yl)” 
x (x — Sny) ® (x — S.()y) 
497 (Ix — S. (yl) (x — Sn(t)y) & (x — Sn (y) 
+ 24 (llx — 55 (0)»17)1) dt 
+ an Dhn (x) & Dhn, (x). 
Let (61, ..., 65, ...] be a complete orthonormal basis for E. Then, 
| Tr[D^h,, e (x) Bn Br] 


=|) Dh, (x) Bn Bren, £x) 
k à 


< sup (an (64 (Ix — Sn (y)? +4162 — $0» 12) 
x | B(x — SOY) lo, 
+ sup (læ — Sn OYI) Bull quo c + nll By Dhns) 
4a, sup(llx — Sn Oyl; (1x — SOIP) Brllliouo.E))” 
+ 4sup (lx — Sn (t)y II) lx — Ss)» 121Bs12., qu, zy 


+ 2sup 9 (lx — Su(OYIP)I Ball qu e an lin s COT MLB c.2 
t> 


< (4an sup(rag(r?))? + 4suprldZ (2| -2sup d£?) + an) TAAN 
r> r> r> 


To derive the second inequality above, we used the following: let e1, ..., ex, ... be 
a complete orthonormal system for Uo; then for each z € E, 


| Brzil* = Y (sz. ex) = Y (e, Buex)? < Y zl LBaecl 


k k k ' 


= zB qus, 2): 




Noting (A.16), (A.19) holds. 
Part (5): Let r > 0; then 


hn g(x) --r = e log r g^ nere 1x5, Oy?) dt 
ay 0 


1 oo 2 
> —— log | e dn rt be CIL G7)x — n (t--r)yll^) dt 
a 0 


UNE log J Tg an be 15 (x5, OI) gy 
a r 


> An, e(Sn(r)x). 
Hence we have (A.20). Equation (A.21) follows from (A.18), (A.19) and 
D*o o hn (x)= p” o hs e (X)(Dh, e (x) e» Dh, «(x)) T p o hy s (x) D^ h, (x). 
[] 


Acknowledgments. I would like to thank Professor Tom Kurtz for useful 
discussions on stochastic equation in infinite dimensions, and Professor Markos 
Katsoulakis for useful discussions on stochastic Allen-Cahn and Cahn-Hilliard 
equations. An anonymous referee carefully read through the paper, and provided 
an extensive list of thoughtful comments and suggestions. These are incorporated 
in the paper; I gratefully acknowledge the help as well. 


REFERENCES 


[1] ADAMS, R. (1975). Sobolev Spaces. Academic Press, New York. MR450957 

[2] BARLES, G. and PERTHAME, B. (1988). Exit time problems in optimal control and vanishing 
viscosity solution method. SIAM J. Control Optim. 26 1133-1148. MR957658 

[3] BARLES, G. and PERTHAME, B. (1990). Comparison results in Dirichlet type first order 
Hamilton—Jacobi equations. Appl. Math. Optim. 21 21-44. MR1014943 

[4] CHOW, P.-L. (1992). Large deviation problem for some parabolic Itó equations. Comm. Pure 
Appl. Math. 45 97-120. MR1135925 

[5] CRANDALL, M. G., ISHII, H. and LIONS, P.-L. (1992). User's guide to viscosity solutions
of second order partial differential equations. Bull. Amer. Math. Soc. (N.S.) 27 1-67.
MR1118699

[6] CRANDALL, M. G. and LIGGETT, T. M. (1971). Generation of semigroups of nonlinear
transformations on general Banach spaces. Amer. J. Math. 93 265-298. MR287357

[7] CRANDALL, M. G. and LIONS, P.-L. (1985). Hamilton-Jacobi equations in infinite
dimensions, Part I. J. Funct. Anal. 62 379-396; Part II. ibid. 65 (1986) 368-405; Part III. ibid.
68 (1986) 214-247; Part IV. ibid. 90 (1990) 237-283; Part V. ibid. 97 (1991) 417-465;
Part VII. ibid. 125 (1994) 111-148.

[8] CRANDALL, M. G. and LIONS, P.-L. (1994). Hamilton-Jacobi equations in infinite
dimensions, Part VI: Nonlinear A and Tataru's method refined. Lecture Notes in Pure and Appl.
Math. 155 51-89. Dekker, New York. MR1254890

[9] DA PRATO, G. and ZABCZYK, J. (1992). Stochastic Equations in Infinite Dimensions.
Cambridge Univ. Press. MR1207136


[10] DAWSON, D. and GARTNER, J. (1987). Large deviation from the McKean-Vlasov limit for
weakly interacting diffusions. Stochastics 20 247-308. MR885876

[11] DE ACOSTA, A. (1997). Exponential tightness and projective systems in large deviation theory. 
In Festschrift for Lucien Le Cam 143-156. Springer, New York. MR1462943 

[12] DEMBO, A. and ZEITOUNI, O. (1998). Large Deviations Techniques and Applications, 2nd ed. 
Springer, New York. MR1619036 

[13] DONSKER, M. D. and VARADHAN, S. R. S. (1975). Asymptotic evaluation of certain Markov 
process expectations for large time, I. Comm. Pure Appl. Math. 28 1—47; II ibid. 28 (1975) 
279—301; III ibid. 29 (1976) 389-461. 

[14] EKELAND, I. (1979). Nonconvex minimization problems. Bull. Amer. Math. Soc. (N.S.) 1
443-474. MR526967

[15] ETHIER, S. N. and KURTZ, T. G. (1986). Markov Processes. Wiley, New York. MR838085 

[16] EVANS, L. C. and ISHII, H. (1985). A PDE approach to some asymptotic problems concerning
random differential equations with small noise intensities. Ann. Inst. H. Poincaré Anal.
Non Linéaire 2 1-20. MR781589

[17] FENG, J. (1999). Martingale problems for large deviations of Markov processes. Stoch. 
Process. Appl. 81 165—216. MR1694569 

[18] FENG, J. and KURTZ, T. G. (2003). Large deviations of stochastic processes. Preprint. 

[19] FLEMING, W. H. (1978). Exit probabilities and optimal stochastic control. Appl. Math. Optim. 
4 329—346. MR512217 

[20] FLEMING, W. H. and SONER, H. M. (1993). Controlled Markov Processes and Viscosity So- 
lutions. Springer, New York. 

[21] FLEMING, W. H. and SOUGANIDIS, P. E. (1986). Asymptotic series and the method of
vanishing viscosity. Indiana Univ. Math. J. 35 425-447. MR833404

[22] FLEMING, W. H. and SOUGANIDIS, P. E. (1986). A PDE approach to asymptotic estimates
for optimal exit probabilities. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 13 171-192.
MR876121

[23] FREIDLIN, M. I. and WENTZELL, A. D. (1983). Random Perturbations of Dynamical Systems.
Springer, New York. MR1652127

[24] KALLIANPUR, G. and XIONG, J. (1996). Large deviations for a class of stochastic partial 
differential equations. Ann. Probab. 24 320—345. MR1387638 

[25] KURTZ, T. G. (1987). Martingale problems for controlled processes. Lecture Notes in Control
and Inform. Sci. 91 75-90. Springer, Berlin. MR894107

[26] MIYADERA, I. (1992). Nonlinear Semigroups. Amer. Math. Soc., Providence, RI.

[27] PESZAT, S. (1994). Large deviation principle for stochastic evolution equations. Probab.
Theory Related Fields 98 113-136. MR1254827

[28] PUHALSKII, A. (1991). On functional principle of large deviations. In New Trends in Proba- 
bility and Statistics 1 198—219. VSP, Utrecht. MR1200917 

[29] SOWERS, R. (1992). Large deviations for a reaction-diffusion equation with non-Gaussian 
perturbations. Ann. Probab. 20 504—537. MR1143433 

[30] SPOHN, H. (1991). Large Scale Dynamics of Interacting Particles. Springer, Berlin. 

[31] TATARU, D. (1992). Viscosity solutions of Hamilton-Jacobi equations with unbounded non- 
linear terms. J. Math. Anal. Appl. 163 345-392. MR1145836 

[32] TATARU, D. (1994). Viscosity solutions for Hamilton-Jacobi equations with unbounded non- 
linear term: A simplified approach. J. Differential Equations 111 123-146. MR1280618 


DEPARTMENT OF MATHEMATICS AND STATISTICS
UNIVERSITY OF MASSACHUSETTS-AMHERST
AMHERST, MASSACHUSETTS 01003
USA
E-MAIL: feng@math.umass.edu


The Annals of Probability 

2006, Vol. 34, No. 1, 386-422 

DOI: 10.1214/009117905000000576

© Institute of Mathematical Statistics, 2006 


FINITELY ADDITIVE BELIEFS AND UNIVERSAL TYPE SPACES¹


BY MARTIN MEIER 
Instituto de Análisis Económico, CSIC 


The probabilistic type spaces in the sense of Harsanyi [Management Sci. 
14 (1967/68) 159—182, 320—334, 486—502] are the prevalent models used to 
describe interactive uncertainty. In this paper we examine the existence of a 
universal type space when beliefs are described by finitely additive proba- 
bility measures. We find that in the category of all type spaces that satisfy 
certain measurability conditions ($\kappa$-measurability, for some fixed regular cardinal 
$\kappa$), there is a universal type space (i.e., a terminal object) to which every 
type space can be mapped in a unique beliefs-preserving way. However, by 
a probabilistic adaption of the elegant sober-drunk example of Heifetz and 
Samet [Games Econom. Behav. 22 (1998) 260—273] we show that if all sub- 
sets of the spaces are required to be measurable, then there is no universal 
type space. 


1. Introduction. Consider players that are uncertain about a set S, called the 
space of states of nature, each element of which can be thought of as a complete list 
of the players’ strategy sets and payoff functions, that is, a complete specification 
of the “rules” of the game that depend on the state of nature. (Other interpreta- 
tions are also possible. For example, if a game of complete information is given, 
a state $s \in S$ could be the strategy profile that the players are actually going to 
choose; see the analysis of epistemic conditions for Nash equilibrium by Aumann 
and Brandenburger [1].) In such a situation, following a Bayesian approach, each 
player will base his choice of a strategy on his subjective beliefs (i.e., a probability 
measure) on S. Since a player's payoff depends also on the choices of the other 
players, and these are based on their beliefs as well, each player must also have 
beliefs on the other players' beliefs on S. For the same reason, he must also have 
beliefs on other players’ beliefs on his beliefs on S, beliefs on other players’ beliefs 
on his beliefs on their beliefs on S, and so on. So, in analyzing such a situation, it 


Received May 2002; revised May 2005. 
¹Supported by the DFG (German Science Foundation) via the Graduiertenkolleg "Mathematische 
Wirtschaftsforschung" (University of Bielefeld), the European Union via the TMR-Network 
"Cooperation and Information" (University of Tel Aviv and Université de Caen) and a Marie 
Curie Individual Fellowship at CORE (Université Catholique de Louvain), and the Spanish Min- 
isterio de Educación y Ciencia via a Ramón y Cajal Fellowship (IAE-CSIC) and Research Grant 
SEJ2004-07861. 
AMS 2000 subject classifications. 91A40, 91A35, 28E. 
Key words and phrases. Finitely additive probability measures, $\kappa$-measurability, Harsanyi type 
spaces, universal type space, games of incomplete information. 




seems to be unavoidable to work with infinite hierarchies of beliefs. Thus, the re- 
sulting model is complicated and cumbersome to handle. In fact, this was the issue 
that prevented for a long time the analysis of games of incomplete information. 

A major breakthrough took place with three articles of Harsanyi [5], where he 
succeeded in finding another, more workable model to describe interactive uncer- 
tainty. He invented the notions of type and type space: With each point in a type 
space, called a state of the world, are associated a state of nature and, for each 
player, a probability measure on the type space itself (i.e., that player's type in this 
state of the world). Usually it is assumed that the players "know their own type," 
that is, a type of a player in a state assigns probability 1 to the set of those states 
where this player is of this type. This is the formalization of the idea that the play- 
ers should be self-conscious. Since each state of the world is associated with a state 
of nature, each player's type in a state of the world induces a probability measure 
on S. But also, since with each state of the world there is associated a type for 
each player (and hence indirectly a probability measure on S for this player), the 
type of a player in a state of the world induces a probability measure on the other 
players’ probability measures on S. Proceeding like this, one obtains in each state 
of the world a hierarchy of beliefs for each player, in the sense described above. 

The advantages of Harsanyi's model are obvious: Since we have in each state 
of the world just one probability measure for each player, contrary to the hierar- 
chical description of beliefs, this model fits in the classical Bayesian framework 
of describing beliefs by one probability measure, and provides therefore all its 
advantages (e.g., it allows for integration with respect to beliefs). 

However, there are also several serious questions that arise with the use of this 
model: Although each state of the world in a type space induces a hierarchy of be- 
liefs for each player, the converse is not obvious: Does each profile of hierarchies 
of beliefs arise from a state of the world in some type space, and if so, is there a 
type space such that every profile of belief hierarchies is generated by some state 
in this type space? What “are” the states of the world, and what justifies using 
one particular type space and not another? In particular, contrary to the case of the 
hierarchical description of beliefs, it is not clear what “all possible types" (resp. 
"all possible states of the world") are. More precisely, given a game of incomplete 
information, working in one fixed type space to analyze that game could be restric- 
tive in the sense that we might miss some possible types (resp. possible states of 
the world) that are just present in a bigger type space that contains the one we use. 
If this were the case for every type space, then the use of type spaces would be 
problematic from a theoretical point of view, because of the restrictive character of 
this concept, but it would be problematic also from a more practical point of view: 
In their contributions to the debate on epistemic conditions for backward induc- 
tion in perfect information games, Stalnaker [16] and Battigalli and Siniscalchi [2] 
have pointed out that the players do "their best" to rationalize their opponents' behavior 
if the backward-induction outcome is to obtain. This translates into using a 
type space where a player can find the needed types he has to attribute to the other 
players, if he has to explain (i.e., rationalize) the others’ behavior. 

The question concerning “all possible types” can be answered and the related 
problems can be solved if there is a type space to which every type space (on the 
same space of states of nature and for the same set of players, of course) can be 
mapped, preferably always in a unique way, by a map that preserves the structure of 
the type space, that is, the manner in which types and states of nature are associated 
with states of the world, a so-called type morphism. Such a type space would be 
called a universal type space. If such a space always exists, one could, in principle, 
carry out the analysis of a game of incomplete information in the corresponding 
universal type space without any risk of missing a relevant state of the world. On 
a technical level, the type spaces—on a fixed set of states of nature and for a fixed 
player set—as objects and the type morphisms as morphisms form a category. If we 
always require the map from a type space to the universal type space to be unique, 
then, if it exists, such a universal type space is a terminal object of this category. 
A terminal object of a category is known to be unique up to isomorphism. Hence, 
we are justified to talk about the universal type space. 

The existence of a universal type space was proved by Mertens and Zamir [14] 
under the assumption that the underlying space of states of nature is a compact 
Hausdorff space and all involved functions are continuous. That topological as- 
sumption was relaxed by Brandenburger and Dekel [3], Heifetz [6] and Mertens, 
Sorin and Zamir [13] to more general topological assumptions. Finally, the gen- 
eral measure-theoretic case was solved by Heifetz and Samet [10], who showed 
that there also exists a universal type space in this case. However, in all these articles 
it has always been assumed that the players' beliefs are $\sigma$-additive. This seems 
to be a rather strong assumption on the epistemic attitudes of the players. 

Savage's postulates [15] imply subjective probabilities that are finitely but not 
countably additive. Given the importance of Savage's theory within decision the- 
ory (i.e., “one-player game theory”), it is natural to ask—and desirable to know— 
what happens if we describe the beliefs of players accordingly in an interactive 
context (games) by finitely additive probability measures? Does there still exist a 
universal type space? 

Still, we are dealing with (now finitely additive) measures, so we have to define 
a field of events. The question arises as to which measurability condition is the 
right one. Should the field of events be just a field, a $\sigma$-field, or should we assume 
that all subsets of the space of states of the world are events, that is, that the field is 
simply the power set? As this question does not seem to have a clear-cut answer, we 
analyze the existence/nonexistence of a universal type space for finitely additive 
beliefs for a whole class of different measurability conditions that include the three 
above-mentioned cases. 

We introduce $\kappa$-fields, where $\kappa$ is a (usually regular) infinite cardinal number, 
as fields that are closed under the intersection of every set of events (i.e., subset of 
the field) that has cardinality strictly less than $\kappa$. It follows that $\aleph_0$-fields are fields 
in the usual sense and $\aleph_1$-fields are $\sigma$-fields in the usual sense. Then, we define 
$\infty$-fields as fields that are closed under the intersection of every set of events (of 
any cardinality whatsoever). 

We define $\kappa$-type spaces as type spaces where the set of measurable events in 
the set of states of the world, as well as the set of measurable events in the set 
of states of nature, is a $\kappa$-field, and $\infty$-type spaces as type spaces where the set of 
measurable events in the set of states of nature is the full power set and the set of 
measurable events in the set of states of the world is an $\infty$-field. Also, we define 
$*$-type spaces as type spaces where the set of measurable events in the set of states 
of the world as well as the set of measurable events in the set of states of nature 
is the full power set. Furthermore, we define type morphisms, that is, structure- 
preserving maps from one $\kappa$-type space (resp. $\infty$-type space or $*$-type space) to 
another (not necessarily different) one. 

Given a nonempty set of players $I$, a nonempty set of states of nature $S$ and 
a $\kappa$-field $\Sigma_S$ on $S$, we define, similar to Heifetz and Samet [10], a kind of modal 
language, the formulas of which we call $\kappa$-expressions. But if $\kappa$ is uncount- 
able, contrary to Heifetz and Samet [10], we allow also for formulas of infinite 
length (but strictly less than $\kappa$). Then, we collect the $\kappa$-descriptions (by means of 
$\kappa$-expressions) of all states of the world in all $\kappa$-type spaces on $S$ for player set $I$. 
Then we show that the set of $\kappa$-descriptions can be endowed with the structure of 
a $\kappa$-type space (Proposition 4). In this way, we construct in Section 4 a universal 
$\kappa$-type space on $S$ for player set $I$ to which every $\kappa$-type space on $S$ for player 
set $I$ can be mapped by a unique type morphism (Theorem 1). 

As Heifetz [7] has shown, there are consistent hierarchies of finitely additive (in 
fact even $\sigma$-additive) beliefs up to, but excluding, level $\omega$ (i.e., the first infinite 
level), that have at least two different finitely additive extensions to level $\omega$. Does a 
similar phenomenon hold also on the higher transfinite levels of consistent hierar- 
chies? Or, put differently in terms of expressions rather than hierarchies, is there, 
on the contrary, a regular cardinal $\hat{\kappa}$ such that for all (regular) cardinals $\kappa > \hat{\kappa}$, the 
$\hat{\kappa}$-description of a state in a $\kappa$-type space determines already the $\kappa$-description? If 
this were the case, it would be unnecessary to consider $\kappa$-type spaces for $\kappa > \hat{\kappa}$ 
and we could restrict ourselves to $\kappa$-type spaces for $\kappa \le \hat{\kappa}$. We show in Theo- 
rem 3, by using a probabilistic adaptation of the "sober-drunk" example of Heifetz 
and Samet [9], that (with at least two players and two states of nature) this is 
not the case. Hence, it makes sense to consider $\kappa$-type spaces for every (regular) 
infinite cardinal $\kappa$. 

Also, this example implies that, again with at least two players and two states 
of nature, there is no universal $\infty$-type space and no universal $*$-type space (The- 
orem 4 and Corollary 1), even if we do not require the morphisms from the type 
spaces to the universal type space to be unique. 


2. Preliminaries. First, we will define $\kappa$-measurable spaces, where $\kappa$ denotes 
here a (usually regular) infinite cardinal number. For an introduction to ordinal 
and cardinal numbers, see [4] or any other textbook on set theory. Then, we will 
develop parts of the theory of $\kappa$-measurable spaces needed in the sequel, collect 
some known facts about finitely additive (probability) measures and define the 
main objects of our study in this paper, the $\kappa$-, $\infty$- and $*$-type spaces. 

An infinite cardinal number $\kappa$ is called regular, if it is not the supremum of a 
set of less than $\kappa$-many ordinal numbers which are all strictly smaller than $\kappa$. For 
example, $\aleph_0$ and all the $\aleph_{\alpha+1}$ are regular, while $\aleph_\omega$ is singular (i.e., infinite and 
not regular) ($\aleph_\omega = \sup\{\aleph_n \mid n < \omega\}$), where $\omega$ denotes here the first infinite ordinal 
number. For a set $M$, denote by $|M|$ the cardinality of $M$. 

Unless otherwise stated, $\alpha, \beta, \gamma, \zeta, \eta, \xi$ denote ordinal numbers, $\delta$ delta- 
measures, $\theta$ functions from the set of states of the world to the set of states of 
nature, $\kappa$ cardinal numbers, $\lambda$ limit ordinal numbers, $\mu$ and $\nu$ measures, $\pi$ pro- 
jections, $\varphi, \chi, \psi$ expressions and $\omega$, apart from above, sets of expressions. For a 
set $M$, $\mathrm{Pow}(M)$ denotes the set of all subsets of $M$, that is, the power set. 

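Let $\kappa$ be an infinite cardinal number and $M$ a nonempty set. A $\kappa$-field on $M$ is a 
field $\Sigma$ on $M$ such that $\mathcal{E} \subseteq \Sigma$ and $|\mathcal{E}| < \kappa$ imply $\bigcap \mathcal{E} := \bigcap_{E \in \mathcal{E}} E \in \Sigma$. It follows 
that $\mathcal{E} \subseteq \Sigma$ and $|\mathcal{E}| < \kappa$ imply $\bigcup \mathcal{E} := \bigcup_{E \in \mathcal{E}} E \in \Sigma$. 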

Consequently, a $\kappa$-measurable space is a pair $(M, \Sigma)$, where $M$ is a nonempty 
set and $\Sigma$ is a $\kappa$-field on $M$. 

A set of subsets of a nonempty set is an $\aleph_0$-field iff it is a field and it is an $\aleph_1$-field 
iff it is a $\sigma$-field. If $\kappa' < \kappa$, then every $\kappa$-field is also a $\kappa'$-field. 


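REMARK 1. Let $\kappa$ be a singular cardinal number and $(M, \Sigma)$ be a $\kappa$-measu- 
rable space. Then $\Sigma$ is already a $\kappa^+$-field, where $\kappa^+$ denotes the successor cardi- 
nal of $\kappa$. 


PROOF. Let $\mathcal{E} \subseteq \Sigma$ such that $|\mathcal{E}| \le \kappa$. So, $\mathcal{E}$ has the form $\{E_\alpha \mid \alpha < \kappa\}$. Let 
$\hat{\kappa} < \kappa$ be the cofinality of $\kappa$. Then there is a function $f : \hat{\kappa} \to \kappa$, such that 
$\bigcup_{\beta < \hat{\kappa}} f(\beta) = \kappa$. Note that $|f(\beta)| < \kappa$, for $\beta < \hat{\kappa}$. It follows that $\bigcap_{\alpha < \kappa} E_\alpha = 
\bigcap_{\beta < \hat{\kappa}} \bigl(\bigcap_{\alpha \in f(\beta)} E_\alpha\bigr) \in \Sigma$. Since $\Sigma$ is a field, it follows that it is a $\kappa^+$-field. □ 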


Since $\kappa^+$ is always regular, the above remark shows that it is redundant to con- 
sider $\kappa$-fields ($\kappa$-measurable spaces, resp.) if $\kappa$ is a singular cardinal. 

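Let $M$ be a nonempty set. An $\infty$-field on $M$ is a field $\Sigma$ on $M$ such that $\mathcal{E} \subseteq 
\Sigma$ implies $\bigcap \mathcal{E} := \bigcap_{E \in \mathcal{E}} E \in \Sigma$. Again, it follows that $\mathcal{E} \subseteq \Sigma$ implies $\bigcup \mathcal{E} := 
\bigcup_{E \in \mathcal{E}} E \in \Sigma$. 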

Accordingly, an $\infty$-measurable space is a pair $(M, \Sigma)$, where $M$ is a nonempty 
set and $\Sigma$ is an $\infty$-field on $M$. 

A $*$-measurable space is a pair $(M, \mathrm{Pow}(M))$, where $M$ is a nonempty set. 

Note that every $*$-measurable space is an $\infty$-measurable space and every 
$\infty$-measurable space is a $\kappa$-measurable space for every infinite cardinal $\kappa$. 
An $\infty$-measurable space $(M, \Sigma)$ is a $*$-measurable space iff for all $m \ne m' \in M$ 
there is an $E \in \Sigma$ such that $m \in E$ and $m' \notin E$. 




EXAMPLE 1. Let $M = \{0, 1\}$, $\Sigma = \{\emptyset, M\}$. $(M, \Sigma)$ is $\infty$-measurable, but not 
$*$-measurable. 
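
On a finite set, Example 1 can be checked mechanically, since every field on a finite set is closed under arbitrary intersections. The following is a minimal Python sketch under that finiteness assumption; the helper names `is_field` and `separates_points` are illustrative and do not come from the paper. The second helper tests the point-separation criterion stated just before the example.

```python
from itertools import combinations

def is_field(M, sigma):
    """Check that sigma is a field of subsets of the finite set M."""
    M = frozenset(M)
    sigma = {frozenset(E) for E in sigma}
    if M not in sigma:
        return False
    # closed under complement
    if any(M - E not in sigma for E in sigma):
        return False
    # closed under pairwise (hence finite) intersection
    return all(E & F in sigma for E, F in combinations(sigma, 2))

def separates_points(M, sigma):
    """True iff for all m != n there is E in sigma with m in E and n not in E."""
    return all(
        any(m in E and n not in E for E in sigma)
        for m in M for n in M if m != n
    )

M = {0, 1}
sigma = [set(), {0, 1}]               # the field of Example 1
power_set = [set(), {0}, {1}, {0, 1}]
```

Here `sigma` passes `is_field` but fails `separates_points`, while `power_set` passes both.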


DEFINITION 1. Let $M$ be a nonempty set and $\mathcal{F}$ a field on $M$. A finitely 
additive measure on $(M, \mathcal{F})$ is a function $\mu : \mathcal{F} \to \mathbb{R} \cup \{+\infty\}$, such that: 


(i) $0 \le \mu(F)$, for all $F \in \mathcal{F}$, 

(ii) $\mu(E \cup F) = \mu(E) + \mu(F)$, for all disjoint $E, F \in \mathcal{F}$. 

$\mu$ is a finitely additive probability measure on $(M, \mathcal{F})$, if in addition 

(iii) $\mu(M) = 1$. 


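DEFINITION 2. Let $M$ be a nonempty set, $\mathcal{F}$ a field on $M$, $\mu$ a finitely additive 
measure on $(M, \mathcal{F})$, and $E \subseteq M$. 
We define the outer measure of $E$ induced by $\mu$ as 

$$\mu^*(E) := \inf\{\mu(F) \mid F \in \mathcal{F} \text{ such that } E \subseteq F\}$$

and the inner measure of $E$ induced by $\mu$ as 

$$\mu_*(E) := \sup\{\mu(F) \mid F \in \mathcal{F} \text{ such that } F \subseteq E\}.$$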
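
As an illustration of Definition 2, the sketch below computes outer and inner measures on a small finite field, where the infimum and supremum are attained and reduce to a minimum and a maximum. The set $M = \{0, 1, 2\}$, the field and the weights are hypothetical choices made only for this example.

```python
from fractions import Fraction

def outer_measure(mu, E):
    """Smallest mu(F) over field elements F that contain E."""
    return min(p for F, p in mu.items() if E <= F)

def inner_measure(mu, E):
    """Largest mu(F) over field elements F contained in E."""
    return max(p for F, p in mu.items() if F <= E)

# Hypothetical data: the field on M = {0, 1, 2} generated by {0},
# with a finitely additive probability measure mu given on it.
M = frozenset({0, 1, 2})
mu = {
    frozenset(): Fraction(0),
    frozenset({0}): Fraction(1, 3),
    frozenset({1, 2}): Fraction(2, 3),
    M: Fraction(1),
}
E = frozenset({0, 1})  # E is not an element of the field
```

For this `E` the only field element containing it is `M`, and the largest field element inside it is `{0}`, so the two values differ.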
If not stated otherwise, we keep the following. 


CONVENTION 1. (i) If $(M, \Sigma)$ is a $\kappa$-measurable space, then $\Delta^\kappa(M, \Sigma)$ de- 
notes the space of finitely additive probability measures on $(M, \Sigma)$. We consider 
this space itself as a $\kappa$-measurable space endowed with the $\kappa$-field $\Sigma_{\Delta^\kappa}$ generated 
by all the sets $\{\mu \in \Delta^\kappa(M, \Sigma) \mid \mu(E) \ge p\}$, where $E \in \Sigma$ and $p \in [0, 1]$. 

(ii) Similarly, we denote by $\Delta(M, \mathrm{Pow}(M))$ the set of all finitely additive prob- 
ability measures on $(M, \mathrm{Pow}(M))$. 


Of course, (i) of this convention depends on the particular $\kappa$ chosen. 


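REMARK 2. Let $(M', \Sigma')$ and $(M, \Sigma)$ be $\kappa$-measurable spaces and let 
$f : M' \to M$ be measurable. Then: 


(a) If $\mu'$ is a finitely additive probability measure on $(M', \Sigma')$, then $\mu'(f^{-1}(\cdot))$ 
[i.e., $\mu'(f^{-1}(E))$, for $E \in \Sigma$] is a finitely additive probability measure on $(M, \Sigma)$. 

(b) If $\Delta^\kappa_f : \Delta^\kappa(M', \Sigma') \to \Delta^\kappa(M, \Sigma)$ is defined by $\Delta^\kappa_f(\mu') := \mu'(f^{-1}(\cdot))$, for 
$\mu' \in \Delta^\kappa(M', \Sigma')$, then $\Delta^\kappa_f$ is measurable, since we have $\Delta^\kappa_f(\mu')(E) \ge p$ iff 
$\mu'(f^{-1}(E)) \ge p$, for $E \in \Sigma$. 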


REMARK 3. Let $(M', \Sigma')$ and $(M, \Sigma)$ be $\kappa$-measurable spaces and let 
$f : M' \to M$ be measurable and onto. Then: 


(a) $f^{-1}(\Sigma) := \{f^{-1}(E) \mid E \in \Sigma\}$ is a $\kappa$-field on $M'$ and a subset of $\Sigma'$. 

(b) If $\mu$ is a finitely additive measure on $(M, \Sigma)$, then $\mu$ induces a finitely addi- 
tive measure $\mu'$ on $(M', f^{-1}(\Sigma))$ defined by $\mu'(f^{-1}(E)) := \mu(E)$. Furthermore, 
if $\mu$ is a finitely additive probability measure, then $\mu'$ is a finitely additive proba- 
bility measure. 


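LEMMA 1. Let $\gamma < \alpha$ be ordinal numbers. For $\gamma \le \beta < \alpha$ let $(M^\beta, \mathcal{F}^\beta)$ be 
an $\aleph_0$-measurable space (i.e., $M^\beta$ is a nonempty set and $\mathcal{F}^\beta$ is a field on $M^\beta$) and 
$\mu^\beta$ a finitely additive probability measure on $(M^\beta, \mathcal{F}^\beta)$, let $M^\alpha$ be a nonempty 
set, and for $\gamma \le \xi < \tau \le \alpha$ let $f_{\xi,\tau} : M^\tau \to M^\xi$ be onto and, if $\tau < \alpha$, let $f_{\xi,\tau}$ be 
$\mathcal{F}^\tau$-$\mathcal{F}^\xi$-measurable, such that: 


1. $f_{\xi,\beta} \circ f_{\beta,\tau} = f_{\xi,\tau}$ for all $\xi, \beta, \tau$ such that $\gamma \le \xi < \beta < \tau \le \alpha$, 

2. $\mu^\beta(f_{\xi,\beta}^{-1}(E^\xi)) = \mu^\xi(E^\xi)$, for all $\xi, \beta$ such that $\gamma \le \xi < \beta < \alpha$ and 
all $E^\xi \in \mathcal{F}^\xi$. 

Then: 


(a) $\bigcup_{\gamma \le \beta < \alpha} f_{\beta,\alpha}^{-1}(\mathcal{F}^\beta)$ is a field on $M^\alpha$, 

(b) $(\mu^\beta)_{\gamma \le \beta < \alpha}$ induces a well-defined finitely additive probability measure 
$\mu^\alpha$ on $\bigl(M^\alpha, \bigcup_{\gamma \le \beta < \alpha} f_{\beta,\alpha}^{-1}(\mathcal{F}^\beta)\bigr)$, defined by $\mu^\alpha(f_{\beta,\alpha}^{-1}(E^\beta)) := \mu^\beta(E^\beta)$, for 
$E^\beta \in \mathcal{F}^\beta$. 


PROOF. That $\mu^\alpha$ is well-defined follows from the above conditions 1 and 2 
and the fact that the $f_{\beta,\alpha}$'s are onto. In light of the preceding remark, the rest is 
clear. □ 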


NOTATION 1. Let $M$ be a nonempty set, $\mathcal{F}$ a field on $M$, and $E \subseteq M$. Then 
denote by $[\mathcal{F}, E]$ the set of all subsets of $M$ of the form $(L \cap E) \cup (N \cap (M \setminus E))$, 
where $L, N \in \mathcal{F}$. It is easy to check that $[\mathcal{F}, E]$ is the smallest field that extends $\mathcal{F}$ 
and contains $E$ as an element. 
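
The construction in Notation 1 is easy to carry out on a finite set. The sketch below builds all sets of the stated form from a given field and a new set $E$; the function name `extend_field` and the example data are hypothetical and serve only as an illustration.

```python
from itertools import product

def extend_field(M, field, E):
    """All sets (L & E) | (N & (M - E)) with L, N ranging over the field."""
    M, E = frozenset(M), frozenset(E)
    return {(L & E) | (N & (M - E)) for L, N in product(field, repeat=2)}

# Hypothetical data: the field on M = {0, 1, 2} generated by {0}.
M = frozenset({0, 1, 2})
field = {frozenset(), frozenset({0}), frozenset({1, 2}), M}
E = frozenset({0, 1})
extended = extend_field(M, field, E)
```

In this example adding `E` separates the points 1 and 2, so `extended` is the full power set of `M` (eight sets) and contains both the original field and `E`.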


For further reference, we cite the following two lemmas (in a somewhat dif- 
ferent form), which are theorems by Łoś and Marczewski [12] and Horn and 
Tarski [11]. 


LEMMA 2. Let $M$ be a nonempty set, $\mathcal{F}$ a field on $M$, $E \subseteq M$, $\mu$ a finitely 
additive probability measure on $(M, \mathcal{F})$, $\mu_*(E)$ the inner measure of $E$, $\mu^*(E)$ 
the outer measure of $E$, and $p$ a real number such that $\mu_*(E) \le p \le \mu^*(E)$. 

Then there exists a finitely additive probability measure $\nu$ that extends $\mu$ to the 
field $[\mathcal{F}, E]$ such that $\nu(E) = p$. 


PROOF. Follows directly from Theorem 2 of [12]. □ 


Sometimes, we will refer to the above lemma as the "Łoś-Marczewski theo- 
rem." 




LEMMA 3. Let $\mathcal{F}_1 \subseteq \mathcal{F}_2$ be fields on the nonempty set $M$ and let $\mu$ be a finitely 
additive probability measure on $(M, \mathcal{F}_1)$. Then there exists an extension of $\mu$ to a 
finitely additive probability measure $\nu$ on $(M, \mathcal{F}_2)$. 


PROOF. Follows from point (i) of Section 4 of [12] and also from [11], 
page 477, Theorem 1.21. □ 


3. Type spaces. For this section, unless otherwise stated, we fix a regular car- 
dinal $\kappa$. Furthermore we fix a nonempty set of players $I$, a nonempty set of states of 
nature $S$, and, unless otherwise stated, a $\kappa$-field $\Sigma_S$ on $S$, such that for all $s, s' \in S$ 
with $s \ne s'$ there is an $E \in \Sigma_S$ such that $s \in E$ and $s' \notin E$. 

We define now $\kappa$-type spaces, $\infty$-type spaces and $*$-type spaces, that is, the 
objects which we will study in this paper. 


DEFINITION 3. A $\kappa$-type space on $S$ for player set $I$ is a 4-tuple 

$$\underline{M} := \langle M, \Sigma, (T_i)_{i \in I}, \theta \rangle,$$

where: 


(a) $M$ is a nonempty set, 

(b) $\Sigma$ is a $\kappa$-field on $M$, 

(c) for $i \in I$: $T_i$ is a $\Sigma$-$\Sigma_{\Delta^\kappa}$-measurable function from $M$ to $\Delta^\kappa(M, \Sigma)$, 
the space of finitely additive probability measures on $(M, \Sigma)$, such that for all 
$m \in M$ and $A \in \Sigma$: $[T_i(m)] \subseteq A$ implies $T_i(m)(A) = 1$, where $[T_i(m)] := \{m' \in 
M \mid T_i(m') = T_i(m)\}$, 

(d) $\theta$ is a $\Sigma$-$\Sigma_S$-measurable function from $M$ to $S$. 


This structure is interpreted as follows: $M$ is the set of states of the world. Such 
a state determines completely the objective parameters of the players' interaction, 
that is, the state of nature $\theta(m)$, as well as the players' beliefs about the true 
state of the world. In general, in a state of the world $m \in M$, player $i$ will not know 
the true state of the world $m$; he will just have a probability measure $T_i(m)$ over 
the set of states of the world. $T_i(m)$ describes his beliefs in state $m$, that is, the 
type of player $i$ in state $m$. [Knowing $m$ would mean that $T_i(m) = \delta_m$, where $\delta$ 
denotes the Kronecker delta.] $\theta(m)$ is the state of nature that corresponds to the 
state of the world $m$. While there might be many states of the world to which a 
given state of nature $s \in S$ corresponds, we have that to every state of the world 
there corresponds one and only one state of nature. This is expressed by the fact 
that $\theta$ is a function $\theta : M \to S$. 

We will refer to the property that for all $i \in I$, $m \in M$ and $A \in \Sigma$: $[T_i(m)] \subseteq A$ 
implies $T_i(m)(A) = 1$, as the introspection property of $\kappa$-type spaces ($\infty$-type 
spaces and $*$-type spaces, resp., see below). This expresses the self-consciousness 
of the players: In a state of the world $m$ a player does not attribute a positive 
probability to states where he has a different belief from the belief he has in the 
present state $m$. 

With the obvious changes, the proofs of the theorems in this paper would go 
through if we were to abandon this property; in fact, things would be easier then. 
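
To make the introspection property concrete, here is a minimal Python sketch for one player on a finite set of states in which every subset is assumed to be an event. Under that assumption the set of states sharing a given type is itself an event, so the property reduces to requiring that each type assigns probability 1 to that set. All names and numbers below are hypothetical illustrations.

```python
from fractions import Fraction

def measure(weights, A):
    """Measure of the set A under a finitely supported weight function."""
    return sum(w for state, w in weights.items() if state in A)

def introspective(states, T):
    """Check that T[m] gives probability 1 to the states with the same type as m."""
    for m in states:
        same_type = {n for n in states if T[n] == T[m]}
        if measure(T[m], same_type) != 1:
            return False
    return True

half = Fraction(1, 2)
states = {0, 1, 2}
# States 0 and 1 share one belief; in state 2 the player is certain of state 2.
T_good = {0: {0: half, 1: half}, 1: {0: half, 1: half}, 2: {2: Fraction(1)}}
# In state 0 the player puts weight on state 2, where his belief is different.
T_bad = {0: {0: half, 2: half}, 1: {0: half, 1: half}, 2: {2: Fraction(1)}}
```

`T_good` satisfies the property, whereas `T_bad` fails it at state 0.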


DEFINITION 4. An $\infty$-type space on $S$ for player set $I$ is a 4-tuple 

$$\underline{M} := \langle M, \Sigma, (T_i)_{i \in I}, \theta \rangle,$$

where: 


(a) $M$ is a nonempty set, 

(b) $\Sigma$ is an $\infty$-field on $M$, 

(c) for $i \in I$: $T_i$ is a measurable function from $(M, \Sigma)$ to $\Delta^\infty(M, \Sigma)$, the space 
of finitely additive probability measures on $(M, \Sigma)$, endowed with the $\infty$-field 
generated by all the sets $\{\mu \in \Delta^\infty(M, \Sigma) \mid \mu(E) \ge p\}$, where $E \in \Sigma$ and $p \in [0, 1]$, 
such that for all $m \in M$ and $A \in \Sigma$: $[T_i(m)] \subseteq A$ implies $T_i(m)(A) = 1$, where 
$[T_i(m)] := \{m' \in M \mid T_i(m') = T_i(m)\}$, 

(d) $\theta$ is a $\Sigma$-$\mathrm{Pow}(S)$-measurable function from $M$ to $S$. 


Note that for $\mu \ne \nu \in \Delta^\infty(M, \Sigma)$ there is an $E \in \Sigma$ and a $p \in [0, 1]$ such that 
$\mu(E) \ge p$ and $\nu(E) < p$. This implies that the $\infty$-field of $\Delta^\infty(M, \Sigma)$ is in fact 
$\mathrm{Pow}(\Delta^\infty(M, \Sigma))$, the full power set. Hence, by the measurability of $T_i$, we have 
$[T_i(m)] \in \Sigma$. So, in fact, the condition that $[T_i(m)] \subseteq A$ implies $T_i(m)(A) = 1$ 
reduces to $T_i(m)([T_i(m)]) = 1$. 

By the definitions, it is obvious that every $\infty$-type space on $S$ is a $\kappa$-type space 
on $S$, for every regular $\kappa$. [Set $\Sigma_S := \mathrm{Pow}(S)$ in the $\kappa$-type space.] 


DEFINITION 5. A $*$-type space on $S$ for player set $I$ is a 3-tuple 

$$\underline{M} := \langle M, (T_i)_{i \in I}, \theta \rangle,$$

where: 


(a) $M$ is a nonempty set, 

(b) for $i \in I$: $T_i$ is a function from $M$ to $\Delta(M, \mathrm{Pow}(M))$, the space of fi- 
nitely additive probability measures on $(M, \mathrm{Pow}(M))$, such that for all $m \in 
M$: $T_i(m)([T_i(m)]) = 1$, where $[T_i(m)] := \{m' \in M \mid T_i(m') = T_i(m)\}$, 

(c) $\theta$ is a function from $M$ to $S$. 


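Equivalently, a $*$-type space $\underline{M}$ can be written as $\underline{M} = \langle M, \Sigma, (T_i)_{i \in I}, \theta \rangle$, where 
$\Sigma = \mathrm{Pow}(M)$. So, every $*$-type space on $S$ is an $\infty$-type space on $S$. And therefore, 
it is also a $\kappa$-type space on $S$. 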

We define now the beliefs-preserving maps between type spaces. 




DEFINITION 6. Let $\underline{M}' = \langle M', \Sigma', (T_i')_{i \in I}, \theta' \rangle$ and $\underline{M} = \langle M, \Sigma, (T_i)_{i \in I}, \theta \rangle$ 
be $\kappa$-type spaces ($\infty$-type spaces, $*$-type spaces, resp.) on $S$ for player set $I$. 

A function $f : M' \to M$ is a type morphism if it satisfies the following condi- 
tions: 


1. $f$ is $\Sigma'$-$\Sigma$-measurable, 
2. for all $m' \in M'$: 

$$\theta'(m') = \theta(f(m')),$$

3. for all $m' \in M'$, $E \in \Sigma$ and $i \in I$: 

$$T_i(f(m'))(E) = T_i'(m')(f^{-1}(E)).$$
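
For finite spaces in which every subset is an event, conditions 2 and 3 of Definition 6 can be checked directly: condition 3 says that the image of the belief at $m'$ under $f$ coincides with the belief at $f(m')$. The sketch below does this for a single player. The function names and the two small spaces are hypothetical, and the dictionary comparison assumes that beliefs list only states of positive weight.

```python
from fractions import Fraction

def pushforward(weights, f):
    """Image measure under f: the weight of n is the total weight of f^{-1}({n})."""
    image = {}
    for state, w in weights.items():
        image[f[state]] = image.get(f[state], 0) + w
    return image

def is_type_morphism(f, T_src, theta_src, T_dst, theta_dst):
    """Check conditions 2 and 3 for one player; condition 1 is automatic
    because every subset is assumed to be an event."""
    for m in f:
        if theta_src[m] != theta_dst[f[m]]:
            return False
        if pushforward(T_src[m], f) != T_dst[f[m]]:
            return False
    return True

half = Fraction(1, 2)
# Hypothetical source space: two states with the same state of nature and belief.
T_src = {"a": {"a": half, "b": half}, "b": {"a": half, "b": half}}
theta_src = {"a": "s", "b": "s"}
# Hypothetical target space: a single state.
T_dst = {"w": {"w": Fraction(1)}}
theta_dst = {"w": "s"}
f = {"a": "w", "b": "w"}
```

Here `f` collapses two indistinguishable states into one and is a type morphism; changing the state of nature at `"b"` would break condition 2.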


Note that the above definition of a type morphism does not depend on $\kappa$; that 
is, if $\kappa < \kappa'$, and $\underline{M}$ and $\underline{M}'$ are $\kappa'$-type spaces ($\infty$-type spaces, $*$-type spaces, 
resp.), then $f : M' \to M$ is a type morphism from $\underline{M}'$ to $\underline{M}$ viewed as $\kappa'$-type 
spaces ($\infty$-type spaces, $*$-type spaces, resp.) iff it is a type morphism from $\underline{M}'$ 
to $\underline{M}$ viewed as $\kappa$-type spaces. Similarly, if $\underline{M}'$ and $\underline{M}$ are $*$-type spaces, then 
$f : M' \to M$ is a type morphism from $\underline{M}'$ to $\underline{M}$ viewed as $*$-type spaces iff it is a 
type morphism from $\underline{M}'$ to $\underline{M}$ viewed as $\infty$-type spaces. (Note that in the case of 
$*$-type spaces, every function $f : M' \to M$ is measurable.) 


DEFINITION 7. A type morphism is a type isomorphism, if it is one-to-one, 
onto, and the inverse function is also a type morphism. 


It is easy to see that a function $f : M' \to M$ is a type isomorphism iff it is a type 
morphism and an isomorphism of the measurable spaces $(M', \Sigma')$ and $(M, \Sigma)$. 
An easy check shows: 


REMARK 4. $\kappa$-type spaces on $S$ for player set $I$ ($\infty$-type spaces, $*$-type 
spaces, resp.), as objects, and type morphisms, as morphisms, form a category. 


DEFINITION 8. (i) A $\kappa$-type space $\underline{\Omega}$ on $S$ for player set $I$ ($\infty$-type space, 
$*$-type space, resp.) is weak-universal if for every $\kappa$-type space $\underline{M}$ on $S$ for player 
set $I$ ($\infty$-type space, $*$-type space, resp.) there is a type morphism from $\underline{M}$ to $\underline{\Omega}$. 

(ii) A $\kappa$-type space $\underline{\Omega}$ on $S$ for player set $I$ ($\infty$-type space, $*$-type space, resp.) 
is universal if for every $\kappa$-type space $\underline{M}$ on $S$ for player set $I$ ($\infty$-type space, 
$*$-type space, resp.) there is a unique type morphism from $\underline{M}$ to $\underline{\Omega}$. 


To keep the terms of the already existing type space literature, we use the term 
"universal type space," although in the language of category theory the term “ter- 
minal type space" would be more adequate, since the universal type space is a 
terminal object in the category of type spaces. 




PROPOSITION 1. If they exist, universal $\kappa$-type spaces on $S$ for player set 
$I$ ($\infty$-type spaces, $*$-type spaces, resp.) are unique up to type isomorphism. 


PROOF. If $\underline{\Omega}$ and $\underline{U}$ are universal $\kappa$-type spaces ($\infty$-type spaces, $*$-type 
spaces, resp.) (on the same space of states of nature and for the same player set, 
of course), then there are type morphisms $f : U \to \Omega$ and $g : \Omega \to U$. It is easy 
to check that the composite of two type morphisms is also a type morphism and 
that the identity is always a type morphism from a $\kappa$-type space $\underline{\Omega}$ ($\infty$-type space, 
$*$-type space, resp.) to itself. By the uniqueness, it follows that $g \circ f = \mathrm{id}_U$ and 
therefore $f$ is one-to-one and $g$ is onto, and $f \circ g = \mathrm{id}_\Omega$ and therefore $g$ is one- 
to-one and $f$ is onto. $f$ and $g$ are type morphisms and $f = g^{-1}$ and $g = f^{-1}$. □ 


To prove the existence of a universal $\kappa$-type space on $S$ for player set $I$ is the 
goal of the next section. 


4. The universal $\kappa$-type space in terms of expressions. Again, for this sec- 
tion, unless otherwise stated, we fix a regular cardinal $\kappa$, a nonempty player set $I$ 
and a $\kappa$-measurable space of states of nature $(S, \Sigma_S)$ such that for all $s, s' \in S$ with 
$s \ne s'$ there is an $E \in \Sigma_S$ such that $s \in E$ and $s' \notin E$. 

Given these data, we define $\kappa$-expressions (allowing also for infinite conjunc- 
tions) which are natural generalizations of the expressions defined by Heifetz and 
Samet [10]. These are formulas that describe events defined solely in terms of na- 
ture and the players' beliefs. Expressions are defined in a similar fashion as, for ex- 
ample, the formulas of the probability logic of Heifetz and Mongin [8]. Analogous 
to Heifetz and Samet [10], given a $\kappa$-type space on $S$ for player set $I$ and a state 
of the world in this type space, we define the $\kappa$-description of this state as the set 
of those $\kappa$-expressions that are true in this state of the world. Then, we show that 
the set of all $\kappa$-descriptions constitutes a $\kappa$-type space (Proposition 4) and that this 
$\kappa$-type space is the universal $\kappa$-type space (Theorem 1). 


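DEFINITION 9. For a $\kappa$-type space $\langle M, \Sigma, (T_i)_{i \in I}, \theta \rangle$ on $S$ for player set $I$, 
$i \in I$, $E \in \Sigma$ and $p \in [0, 1]$ define 

$$B_i^p(E) := \{m \in M \mid T_i(m)(E) \ge p\}.$$


Note that $B_i^p(E) = T_i^{-1}(\{\mu \in \Delta^\kappa(M, \Sigma) \mid \mu(E) \ge p\})$ and that $B_i^p(E) \in \Sigma$, if 
$E \in \Sigma$. 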


DEFINITION 10. Given a $\kappa$-measurable space of states of nature $(S, \Sigma_S)$ and 
a nonempty player set $I$, the set $\Phi^\kappa$ of $\kappa$-expressions is the least set such that: 


1. every $E \in \Sigma_S$ is a $\kappa$-expression, 

2. if $\varphi$ is a $\kappa$-expression, then $\neg\varphi$ is a $\kappa$-expression, 

3. if $\varphi$ is a $\kappa$-expression, then $B_i^p(\varphi)$ is a $\kappa$-expression, for $i \in I$ and $p \in [0, 1]$, 

4. if $\Psi$ is a nonempty set of $\kappa$-expressions with $|\Psi| < \kappa$, then $\bigwedge_{\varphi \in \Psi} \varphi$ is a 
$\kappa$-expression. 


If $\Psi$ is a nonempty set of $\kappa$-expressions with $|\Psi| < \kappa$, then we set 

$$\bigvee_{\varphi \in \Psi} \varphi := \neg \bigwedge_{\varphi \in \Psi} \neg\varphi.$$

Since we work here with a fixed regular $\kappa$, we omit sometimes the superscript $\kappa$. 


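DEFINITION 11. Let $\underline{M} := \langle M, \Sigma, (T_i)_{i \in I}, \theta \rangle$ be a $\kappa$-type space on $S$ for 
player set $I$. Define: 

1. $E^{\underline{M}} := \theta^{-1}(E)$, for $E \in \Sigma_S$, 
2. $(\neg\varphi)^{\underline{M}} := M \setminus \varphi^{\underline{M}}$, for $\varphi \in \Phi^\kappa$, 
3. $(B_i^p(\varphi))^{\underline{M}} := B_i^p(\varphi^{\underline{M}})$, for $\varphi \in \Phi^\kappa$, $i \in I$ and $p \in [0, 1]$, 
4. $(\bigwedge_{\varphi \in \Psi} \varphi)^{\underline{M}} := \bigcap_{\varphi \in \Psi} \varphi^{\underline{M}}$, for $\Psi$ such that $\emptyset \ne \Psi \subseteq \Phi^\kappa$ and $|\Psi| < \kappa$. 

So, defined as above, $\kappa$-expressions define measurable subsets of $M$. It is easy to 
check that $(\bigvee_{\varphi \in \Psi} \varphi)^{\underline{M}} = \bigcup_{\varphi \in \Psi} \varphi^{\underline{M}}$, for $\Psi$ such that $\emptyset \ne \Psi \subseteq \Phi^\kappa$ and $|\Psi| < \kappa$. 


If no confusion may arise, we omit (with some abuse of notation) the super- 
script $\underline{M}$. 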
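
The recursive clauses of Definition 11 translate directly into a recursive evaluator when the type space is finite and conjunctions are finite. The following Python sketch is a hypothetical encoding (expressions as nested tuples, beliefs as weight dictionaries) and is not notation from the paper; it returns the set of states of the world at which an expression holds.

```python
from fractions import Fraction

def extension(expr, states, T, theta):
    """Set of states where expr holds, by recursion on the formation of expr."""
    kind = expr[0]
    if kind == "event":   # ("event", E) with E a set of states of nature
        return {m for m in states if theta[m] in expr[1]}
    if kind == "not":     # ("not", phi)
        return states - extension(expr[1], states, T, theta)
    if kind == "and":     # ("and", [phi_1, ..., phi_n]) with n >= 1
        parts = [extension(e, states, T, theta) for e in expr[1]]
        return set.intersection(*parts)
    if kind == "B":       # ("B", i, p, phi): player i believes phi with probability >= p
        _, i, p, phi = expr
        ext = extension(phi, states, T, theta)
        return {m for m in states
                if sum(w for n, w in T[i][m].items() if n in ext) >= p}
    raise ValueError(kind)

half = Fraction(1, 2)
states = {0, 1}
theta = {0: "s", 1: "t"}
# One player "i" who holds the same uniform belief in both states.
T = {"i": {0: {0: half, 1: half}, 1: {0: half, 1: half}}}
E = ("event", {"s"})
```

For instance, `E` holds only at state 0, while the expression `("B", "i", half, E)` holds at both states, because the player assigns probability one half to state 0 everywhere.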


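DEFINITION 12. For a $\kappa$-type space $\underline{M} := \langle M, \Sigma, (T_i)_{i \in I}, \theta \rangle$ on $S$ for player 
set $I$ and $m \in M$ define $D^\kappa(m)$, the $\kappa$-description of $m$, as 

$$D^\kappa(m) := \{\varphi \in \Phi^\kappa \mid m \in \varphi^{\underline{M}}\}.$$


Again, we omit sometimes the superscript $\kappa$. 
The next proposition says that type morphisms preserve $\kappa$-descriptions. 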


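PROPOSITION 2. Let $\underline{M} = \langle M, \Sigma, (T_i)_{i \in I}, \theta \rangle$ and $\underline{N} = \langle N, \Sigma_N, (T_i^N)_{i \in I}, \theta^N \rangle$ be $\kappa$-type 
spaces on $S$ for player set $I$ and let $f : M \to N$ be a type morphism. Then, for all 
$m \in M$: 

$$D(f(m)) = D(m).$$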


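PROOF. We show by induction on the formation of the expressions that $m \in 
\varphi^{\underline{M}}$ iff $f(m) \in \varphi^{\underline{N}}$: 


(a) Let $E \in \Sigma_S$. We have $\theta^N(f(m)) = \theta(m)$, so $f(m) \in E^{\underline{N}}$ iff $m \in E^{\underline{M}}$. 
(b) We have 

$$f(m) \in (\neg\varphi)^{\underline{N}} \quad\text{iff}\quad f(m) \notin \varphi^{\underline{N}} \quad\text{iff}\quad m \notin \varphi^{\underline{M}} \quad\text{iff}\quad m \in (\neg\varphi)^{\underline{M}}.$$

(c) Let $\Psi$ be a nonempty set of expressions with $|\Psi| < \kappa$. Then 

$$f(m) \in \Bigl(\bigwedge_{\varphi \in \Psi} \varphi\Bigr)^{\underline{N}} \quad\text{iff for all } \varphi \in \Psi : f(m) \in \varphi^{\underline{N}},$$

which is by the induction hypothesis the case iff for all $\varphi \in \Psi$: $m \in \varphi^{\underline{M}}$, which is 
the case iff $m \in (\bigwedge_{\varphi \in \Psi} \varphi)^{\underline{M}}$. 
(d) We have 

$$f(m) \in (B_i^p(\varphi))^{\underline{N}} \quad\text{iff}\quad T_i^N(f(m))(\varphi^{\underline{N}}) \ge p \quad\text{iff}\quad T_i(m)(f^{-1}(\varphi^{\underline{N}})) \ge p.$$

By the induction hypothesis: $f^{-1}(\varphi^{\underline{N}}) = \varphi^{\underline{M}}$. Hence $T_i(m)(f^{-1}(\varphi^{\underline{N}})) = 
T_i(m)(\varphi^{\underline{M}})$. We have $T_i(m)(\varphi^{\underline{M}}) \ge p$ iff $m \in (B_i^p(\varphi))^{\underline{M}}$. It follows that 

$$f(m) \in (B_i^p(\varphi))^{\underline{N}} \quad\text{iff}\quad m \in (B_i^p(\varphi))^{\underline{M}}. \qquad \Box$$


DEFINITION 13. Define $\Omega^\kappa$ to be the set of all $\kappa$-descriptions of states of the 
world in $\kappa$-type spaces on $S$ for player set $I$. For $\varphi \in \Phi^\kappa$ define 

$$[\varphi] := \{\omega \in \Omega^\kappa \mid \varphi \in \omega\}.$$

Again, we omit sometimes the superscript $\kappa$. 


REMARK 5. The set of $\kappa$-descriptions $\Omega^\kappa$ is nonempty. 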


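PROOF. Let $M := \{m\}$ and choose $s \in S$. Set $\Sigma := \mathrm{Pow}(M)$, $T_i(m) := \delta_m$, for 
$i \in I$, and $\theta(m) := s$. Then 

$$\langle M, \Sigma, (T_i)_{i \in I}, \theta \rangle$$

is a $\kappa$-type space (even a $*$-type space) on $S$ for player set $I$ and hence $D(m) \in \Omega^\kappa$. □ 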


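Obviously, we have $\Omega \setminus [\varphi] = [\neg\varphi]$ and $\bigcap_{\psi \in \Psi} [\psi] = [\bigwedge_{\psi \in \Psi} \psi]$, where $\varphi$ is a 
$\kappa$-expression and $\Psi$ is a nonempty set of $\kappa$-expressions with $|\Psi| < \kappa$. It follows 
that: 


REMARK 6. The set 

$$\Sigma_\Omega := \{[\varphi] \mid \varphi \in \Phi\}$$

is a $\kappa$-field on $\Omega$. 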


LEMMA 4. For every $\kappa$-type space $\underline{M}$ on $S$ for player set $I$ and for every
$\varphi \in \Phi^{\kappa}$, the $\kappa$-description map $D : M \to \Omega$ satisfies

\[ D^{-1}([\varphi]) = \varphi^{M}. \]


FINITELY ADDITIVE BELIEFS 399 


PROOF. Clear by the definition of $[\varphi]$. $\Box$

Note that Lemma 4 implies that $D$ is measurable.


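PROPOSITION 3. For every $i \in I$ there exists a function
\[ T_i^{*} : \Omega \to \Delta^{\kappa}(\Omega, \Sigma_{\Omega}) \]
such that for every $\kappa$-type space $\underline{M}$ on $S$ for player set $I$ with $\kappa$-description map $D$
and every $m \in M$:

\[ T_i^{*}(D(m))(\cdot) = T_i(m)(D^{-1}(\cdot)). \]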


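PROOF. For $\omega \in \Omega$ choose a $\kappa$-type space $\underline{M}$ on $S$ for player set $I$ and $m \in M$
such that $D(m) = \omega$. For $[\varphi] \in \Sigma_{\Omega}$ define

\[ T_i^{*}(\omega)([\varphi]) := T_i(m)(D^{-1}([\varphi])). \]
We have
\[ T_i(m)(D^{-1}([\varphi])) = T_i(m)(\varphi^{M}) = \sup\{p \mid B_i^{p}(\varphi) \in D(m)\}, \]
so $T_i^{*}(\omega)([\varphi])$ depends just on $D(m)$ and is well-defined. By Remark 2, we have

\[ T_i(m)(D^{-1}(\cdot)) \in \Delta^{\kappa}(\Omega, \Sigma_{\Omega}). \qquad \Box \]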
LEMMA 5. There is a measurable function $\theta^{*} : \Omega \to S$ such that for every
$\kappa$-type space $\underline{M}$ on $S$ for player set $I$ and every $m \in M$:
\[ \theta^{*}(D(m)) = \theta(m). \]


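PROOF. Let
\[ d_0(m) := \{E \in \Sigma_S \mid m \in \theta^{-1}(E)\}. \]

Obviously, $d_0(m) = D(m) \cap \Sigma_S$. By the properties of $(S, \Sigma_S)$, we have for all
$s \in S$: $\{s\} = \bigcap_{s \in E \in \Sigma_S} E$. It follows for every $\kappa$-type space $\underline{M}'$ on $S$ for player
set $I$ and $m' \in M'$ that

\[ \theta(m') = s \ \text{ iff } \ d_0(m') = \{E \in \Sigma_S \mid s \in E\}. \]

For $\omega \in \Omega$ choose a $\kappa$-type space $\underline{M}$ on $S$ for player set $I$ and $m \in M$, such that
$D(m) = \omega$. Define now $\theta^{*}(\omega) := \theta(m)$. Since $\theta(m)$ just depends on $D(m)$, $\theta^{*}(\omega)$
is well-defined.

It remains to show that $\theta^{*}$ is measurable: Let $E \in \Sigma_S$. We have

\[ \theta^{*}(D(m)) \in E \ \text{ iff } \ m \in \theta^{-1}(E) \ \text{ iff } \ E \in D(m) \ \text{ iff } \ D(m) \in [E]. \]
It follows that $\theta^{*-1}(E) = [E]$. $\Box$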




PROPOSITION 4.
\[ \langle \Omega, \Sigma_{\Omega}, (T_i^{*})_{i\in I}, \theta^{*}\rangle \]
is a $\kappa$-type space on $S$ for player set $I$.


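PROOF. It remains to show:

1. For every $i \in I$: $T_i^{*}$ is measurable as a function from $\Omega$ to $\Delta^{\kappa}(\Omega, \Sigma_{\Omega})$.
2. For every $i \in I$, $\omega \in \Omega$ and $A \in \Sigma_{\Omega}$: If
\[ \{\omega' \in \Omega \mid T_i^{*}(\omega') = T_i^{*}(\omega)\} \subseteq A, \]
then $T_i^{*}(\omega)(A) = 1$.

1. Since inverse images commute with unions, intersections and complements,
it is enough to show that $T_i^{*-1}(b^{p}(E)) \in \Sigma_{\Omega}$, for

\[ b^{p}(E) := \{\mu \in \Delta^{\kappa}(\Omega, \Sigma_{\Omega}) \mid \mu(E) \ge p\}, \]
where $E \in \Sigma_{\Omega}$ and $p \in [0, 1]$. We have
\[ T_i^{*-1}(b^{p}(E)) = \{\omega \in \Omega \mid T_i^{*}(\omega)(E) \ge p\}. \]

Since $E \in \Sigma_{\Omega}$, there is a $\kappa$-expression $\varphi$ such that $E = [\varphi]$. Note that if $p \in [0, 1]$
and $p = \sup\{q \mid B_i^{q}(\varphi) \in \omega\}$, then $B_i^{p}(\varphi) \in \omega$. This implies that

\[ \omega \in T_i^{*-1}(b^{p}(E)) \ \text{ iff } \ B_i^{p}(\varphi) \in \omega. \]

It follows that $T_i^{*-1}(b^{p}(E)) = [B_i^{p}(\varphi)]$.
2. Let $\varphi$ be a $\kappa$-expression and

\[ \{\omega' \in \Omega \mid T_i^{*}(\omega') = T_i^{*}(\omega)\} \subseteq [\varphi]. \]

Choose a $\kappa$-type space $\underline{M}$ on $S$ for player set $I$ and $m \in M$ such that $D(m) = \omega$.
Let $m' \in M$. If $T_i^{*}(D(m')) \neq T_i^{*}(D(m))$, then there is a $\kappa$-expression $\psi$ such that
$T_i(m')(\psi^{M}) \neq T_i(m)(\psi^{M})$. It follows that

\[ D(\{m' \in M \mid T_i(m') = T_i(m)\}) \subseteq \{\omega' \in \Omega \mid T_i^{*}(\omega') = T_i^{*}(\omega)\}, \]
which implies
\[ \{m' \in M \mid T_i(m') = T_i(m)\} \subseteq D^{-1}([\varphi]) = \varphi^{M}. \]
So we have

\[ 1 = T_i(m)(\varphi^{M}) = T_i(m)(D^{-1}([\varphi])) = T_i^{*}(\omega)([\varphi]). \qquad \Box \]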


LEMMA 6. The $\kappa$-description map
\[ D : \Omega \to \Omega \]
is the identity.




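PROOF. For $\omega \in \Omega$, we have
\[ \omega = \{\varphi \in \Phi \mid \omega \in [\varphi]\}. \]

We have to show that for every $\kappa$-expression $\varphi$ and every $\omega \in \Omega$: $\omega \in \varphi^{\Omega}$ iff
$\omega \in [\varphi]$. We know this already if $\varphi = E$, where $E \in \Sigma_S$. It is obvious that
$\Omega \setminus [\varphi] = [\neg\varphi]$, and that if $\Psi$ is a nonempty set of $\kappa$-expressions of cardinality
$< \kappa$, then

\[ \bigcap_{\varphi\in\Psi}[\varphi] = \Bigl[\bigwedge_{\varphi\in\Psi}\varphi\Bigr]. \]

So it remains to show that $[\varphi] = \varphi^{\Omega}$ implies $[B_i^{p}(\varphi)] = B_i^{p}([\varphi])$. For $\omega \in \Omega$,
choose a $\kappa$-type space $\underline{M}$ on $S$ for player set $I$ and $m \in M$ such that $D(m) = \omega$.
We have

\[ D(m) \in [B_i^{p}(\varphi)] \ \text{ iff } \ B_i^{p}(\varphi) \in D(m) \ \text{ iff } \ T_i(m)(\varphi^{M}) \ge p. \]
But we have
\[ T_i^{*}(\omega)([\varphi]) = T_i(m)(D^{-1}([\varphi])) = T_i(m)(\varphi^{M}). \]
This implies that $[B_i^{p}(\varphi)] = B_i^{p}([\varphi])$. $\Box$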


THEOREM 1. The space
\[ \langle \Omega, \Sigma_{\Omega}, (T_i^{*})_{i\in I}, \theta^{*}\rangle \]
is a universal $\kappa$-type space on $S$ for player set $I$.


PROOF. According to Lemma 4, for every $\kappa$-type space $\underline{M}$ on $S$ for player
set $I$, the $\kappa$-description map $D : M \to \Omega$ is measurable, and according to Proposition 3
and Lemma 5, $D$ is a type morphism. It remains to show that it is the unique
type morphism from $\underline{M}$ to $\Omega$. But this is clear by Proposition 2 and Lemma 6. $\Box$


5. Spaces of arbitrary complexity. Is there a cardinal $\kappa$, such that the
$\kappa$-descriptions determine already the $\kappa'$-descriptions, for all cardinal numbers
$\kappa' > \kappa$? In the sequel, using a probabilistic adaptation of the elegant "sober-drunk"
example of Heifetz and Samet [9] (see that paper also for the "story" interpreting
the mathematical structure), we construct, for every regular cardinal $\kappa'$, a $\kappa'$-type
space (in fact even a $*$-type space), such that for every ordinal $\alpha < \kappa'$ there are at
least two states of the world such that for every $\kappa'$-expression of depth $\le \alpha$ this
$\kappa'$-expression is true either in both states or in neither of the two, and yet there
is a $\kappa'$-expression of depth $\alpha + 1$ that is true in one state and not in the other.
Thus, choosing $\alpha$ and $\kappa'$ with $\kappa \le \alpha < \alpha + 1 < \kappa'$, we answer the above question
in the negative. Hence, it makes sense to consider $\kappa$-type spaces for every regular
cardinality $\kappa$ whatsoever.




On top of that, this example will imply that, for at least two players and at least
two states of nature, there is no universal $*$-type space and no universal $\infty$-type
space (Theorem 4 and Corollary 1).

For this section, let $I := \{a, b\}$ be the set of players (the following analysis can
be trivially extended to more than two players). We fix a set of states of nature
$S = \{h, t\}$, consisting of the two possible outcomes of tossing a coin, h(ead) and
t(ail).

To simplify the notation let us make the following 


CONVENTION 2. $\{i, j\} := \{a, b\}$, that is,


The following three definitions and Definition 19 are taken from Heifetz and
Samet [9].


DEFINITION 14. Let $\alpha \ge 1$ be an ordinal. A record of length $\alpha$ is a sequence
$r^{\alpha} = (r(\beta))_{\beta<\alpha}$ of numbers "0" and "1" such that for every limit ordinal
$\lambda \le \alpha$ there is an ordinal $\gamma < \lambda$ such that $r(\beta) = 0$ for all ordinals $\beta$ that satisfy
$\gamma \le \beta < \lambda$.


For every infinite ordinal $\gamma$ there are a unique natural number $n$ and a unique
limit ordinal $\lambda$ such that $\gamma = \lambda + n$. We say $\gamma$ is even or odd according to whether
$n$ is even or odd. If $\gamma$ is a finite ordinal, that is, a natural number, we take the usual
notion of being even or odd.


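DEFINITION 15. Let $\alpha$ be an ordinal, $r^{\alpha}$ a record of length $\alpha$ and $\lambda$ a limit
ordinal $\le \alpha$. By the definition of a record, there is a minimal ordinal $o^{\lambda}(r^{\alpha}) < \lambda$
such that $r^{\alpha}(\beta) = 0$, for all $\beta$ with $o^{\lambda}(r^{\alpha}) \le \beta < \lambda$.

Define $\lambda$-$\mathrm{par}(r^{\alpha})$, the $\lambda$-parity of $r^{\alpha}$, as

\[ \lambda\text{-}\mathrm{par}(r^{\alpha}) := \begin{cases} \text{even}, & \text{if } o^{\lambda}(r^{\alpha}) \text{ is even},\\ \text{odd}, & \text{if } o^{\lambda}(r^{\alpha}) \text{ is odd}. \end{cases} \]

Note that by the definition of a record, $o^{\lambda}(r^{\alpha})$ must be either 0 or a successor
ordinal [i.e., $o^{\lambda}(r^{\alpha}) = \gamma + 1$, for some ordinal $\gamma$].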
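
As an illustration only, the $\lambda$-parity can be computed mechanically for small ordinals. The sketch below rests on two assumptions that are not part of the text: ordinals below $\omega^2$ are encoded as pairs `(k, n)` standing for $\omega\cdot k + n$ (so tuple comparison is the ordinal order), and a record is encoded by its finite set of positions carrying a "1". The helper names are hypothetical.

```python
from typing import FrozenSet, Tuple

Ordinal = Tuple[int, int]  # assumption: (k, n) encodes omega*k + n, an ordinal below omega^2


def is_limit(o: Ordinal) -> bool:
    """A nonzero ordinal omega*k + n is a limit ordinal exactly when n == 0."""
    k, n = o
    return n == 0 and k > 0


def parity(o: Ordinal) -> str:
    """Parity of omega*k + n is the parity of its finite part n."""
    return "even" if o[1] % 2 == 0 else "odd"


def o_lambda(support: FrozenSet[Ordinal], lam: Ordinal) -> Ordinal:
    """Minimal ordinal o < lam such that the record is 0 on every beta with o <= beta < lam.

    `support` is the finite set of positions where the record equals 1.
    """
    assert is_limit(lam)
    below = [b for b in support if b < lam]  # lexicographic tuple order = ordinal order here
    if not below:
        return (0, 0)
    k, n = max(below)
    return (k, n + 1)  # successor of the last position below lam that carries a 1


def lam_parity(support: FrozenSet[Ordinal], lam: Ordinal) -> str:
    """The lambda-parity of the record, i.e. the parity of o_lambda."""
    return parity(o_lambda(support, lam))
```

In this encoding `o_lambda` always returns either `(0, 0)` or a successor, in line with the note following Definition 15.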


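DEFINITION 16. Let $\alpha$ be an ordinal. Define the spaces $W^{\alpha}$ by:

(i) $W^{0} := \{h, t\}$,
(ii) $W^{\alpha} := \{(w_0, w_a^{\alpha}, w_b^{\alpha}) \mid w_0 \in \{h, t\},\ w_a^{\alpha} \text{ and } w_b^{\alpha} \text{ are records of length } \alpha\}$, if
$\alpha \ge 1$.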




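DEFINITION 17. (i) If $0 < \beta \le \alpha$ and $r^{\alpha} = (r(\xi))_{\xi<\alpha}$ is a record of length $\alpha$,
then denote by $r^{\alpha}{\upharpoonright}\beta$ the record $(r(\xi))_{\xi<\beta}$ of length $\beta$.
(ii) If $0 \le \alpha$ and $w^{\alpha} \in W^{\alpha}$, then define $w^{\alpha}{\upharpoonright}0 := w_0$.
(iii) If $0 < \beta \le \alpha$ and $w^{\alpha} \in W^{\alpha}$, then define $w^{\alpha}{\upharpoonright}\beta := (w_0, w_a^{\alpha}{\upharpoonright}\beta, w_b^{\alpha}{\upharpoonright}\beta)$.


By the definition, it is obvious that $w^{\alpha}{\upharpoonright}\beta \in W^{\beta}$, for every $\beta \le \alpha$.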


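DEFINITION 18. Let $0 \le \beta \le \alpha$. Define
\[ \pi_{\beta,\alpha} : W^{\alpha} \to W^{\beta} \]
by $\pi_{\beta,\alpha}(w^{\alpha}) := w^{\alpha}{\upharpoonright}\beta$.

It is obvious that $\pi_{\xi,\beta}(\pi_{\beta,\alpha}(w^{\alpha})) = \pi_{\xi,\alpha}(w^{\alpha})$, for $0 \le \xi \le \beta \le \alpha$.

REMARK 7. Let $0 \le \beta < \alpha$, $w^{\beta} \in W^{\beta}$, and $i \in \{a, b\}$. Then there are $w^{\alpha}$,
$u^{\alpha} \in W^{\alpha}$ such that

\[ w^{\alpha}{\upharpoonright}\beta = u^{\alpha}{\upharpoonright}\beta = w^{\beta} \quad\text{and}\quad 0 = w_i^{\alpha}(\beta) \neq u_i^{\alpha}(\beta) = 1. \]
In particular, it follows that $\pi_{\beta,\alpha} : W^{\alpha} \to W^{\beta}$ is onto.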

We define for each player $i$ a partition of the space $W^{\alpha}$. Two states are in the
same element of player $i$'s partition if he cannot distinguish them. That is, he has
the same information (the same beliefs) in both states. Let $w^{\alpha} = (w_0, w_a^{\alpha}, w_b^{\alpha})$ be
a state in $W^{\alpha}$. The element of $i$'s partition that contains this state is defined as
follows:

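DEFINITION 19. Let $\alpha$ be an ordinal $> 0$ and $w^{\alpha} \in W^{\alpha}$. We define:

$P_i(w^{\alpha}) := \{(v_0, v_a^{\alpha}, v_b^{\alpha}) \in W^{\alpha} \mid v_i^{\alpha} = w_i^{\alpha}$,
    $w_i^{\alpha}(0) = 1$ implies $v_0 = w_0$,
    for all $\beta$ such that $\beta + 1 < \alpha$:
        $w_i^{\alpha}(\beta + 1) = 1$ implies $v_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)$,
    for every limit ordinal $\lambda < \alpha$:
        $w_i^{\alpha}(\lambda) = 1$ implies $\lambda\text{-}\mathrm{par}(v_j^{\alpha}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha})\}$.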

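REMARK 8. (i) Let $\alpha$ be an ordinal $> 0$. The set $\{P_i(w^{\alpha}) \mid w^{\alpha} \in W^{\alpha}\}$ is a
partition of $W^{\alpha}$ and $w^{\alpha} \in P_i(w^{\alpha})$.

(ii) Let $0 < \beta \le \alpha$ and $u^{\alpha} \in P_i(w^{\alpha})$. Then $u^{\alpha}{\upharpoonright}\beta \in P_i(w^{\alpha}{\upharpoonright}\beta)$, and hence
$\pi_{\beta,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta)) \supseteq P_i(w^{\alpha})$.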
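
For finite levels the partition of Definition 19 can be checked by brute force. The following is a minimal sketch under stated assumptions: the level `n` is a natural number (so no limit ordinals occur and the parity clause is vacuous), players $a$, $b$ are encoded as `0`, `1`, and all function names are hypothetical. It enumerates $W^{n}$, builds the cells $P_i(w)$, and tests both parts of Remark 8.

```python
from itertools import product


def states(n):
    """All states (w0, w_a, w_b) of W^n for a finite level n >= 1."""
    records = list(product((0, 1), repeat=n))
    return [(w0, ra, rb) for w0 in "ht" for ra in records for rb in records]


def in_cell(i, w, v):
    """True iff v lies in P_i(w); i = 0 is player a, i = 1 is player b."""
    j = 1 - i
    wi, wj, vi, vj = w[1 + i], w[1 + j], v[1 + i], v[1 + j]
    if vi != wi:
        return False
    if wi[0] == 1 and v[0] != w[0]:
        return False
    # w_i(b + 1) = 1 forces v_j(b) = w_j(b), for every b with b + 1 < n
    return all(wi[b + 1] == 0 or vj[b] == wj[b] for b in range(len(wi) - 1))


def cell(i, w):
    """The partition element P_i(w) as a frozenset of states."""
    return frozenset(v for v in states(len(w[1])) if in_cell(i, w, v))


def restrict(w, m):
    """The restriction of w to level m, for 1 <= m <= n."""
    return (w[0], w[1][:m], w[2][:m])


def is_partition(i, n):
    """Remark 8(i): the distinct cells cover W^n without overlap, and w is in P_i(w)."""
    all_states = states(n)
    cells = {cell(i, w) for w in all_states}
    sizes_add_up = sum(len(c) for c in cells) == len(all_states)
    return sizes_add_up and all(w in cell(i, w) for w in all_states)


def restriction_respects_cells(i, n, m):
    """Remark 8(ii): u in P_i(w) implies that u restricted to m is in P_i(w restricted to m)."""
    return all(in_cell(i, restrict(w, m), restrict(u, m))
               for w in states(n) for u in cell(i, w))
```

For example, `is_partition(0, 3)` and `restriction_respects_cells(1, 3, 2)` run over the 128 states of $W^{3}$.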




It is easy to see that if $\alpha$ is an infinite ordinal number, then the cardinality of
$W^{\alpha}$ is the same as the cardinality of $\alpha$. [To see that, in the case of an infinite $\alpha$,
the cardinality of $W^{\alpha}$ does not exceed that of $\alpha$, note that the definition of a record
implies that there are only finitely many $\beta < \alpha$ such that $r^{\alpha}(\beta) = 1$. Consider,
assuming the contrary, the minimal $\gamma \le \alpha$ such that there are infinitely many $\beta < \gamma$
with $r^{\alpha}(\beta) = 1$.] Therefore, we have:


REMARK 9. Let $\kappa$ be an infinite cardinal number. Then $|W^{\kappa}| = \kappa$.


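NOTATION 2. (a) For $0 \le \alpha$ and $w_0 \in \{h, t\}$, we denote by $[X_0^{\alpha} = w_0]$ the set
$\{u^{\alpha} \in W^{\alpha} \mid u_0 = w_0\}$.
(b) For $\beta < \alpha$, $i \in \{a, b\}$ and $w_i^{\alpha}(\beta) \in \{0, 1\}$, we denote by $[X_i^{\alpha}(\beta) = w_i^{\alpha}(\beta)]$
the set $\{u^{\alpha} \in W^{\alpha} \mid u_i^{\alpha}(\beta) = w_i^{\alpha}(\beta)\}$.
(c) For a limit ordinal $\lambda \le \alpha$ and $\lambda\text{-}\mathrm{par}(w_i^{\alpha}) \in \{\text{even}, \text{odd}\}$, we denote by
$[\lambda\text{-}\mathrm{par}(X_i^{\alpha}) = \lambda\text{-}\mathrm{par}(w_i^{\alpha})]$ the set $\{u^{\alpha} \in W^{\alpha} \mid \lambda\text{-}\mathrm{par}(u_i^{\alpha}) = \lambda\text{-}\mathrm{par}(w_i^{\alpha})\}$.

REMARK 10. Let $0 \le \alpha \le \gamma$ and $w_0 \in \{h, t\}$. Then:
(i) $\pi_{\alpha,\gamma}^{-1}([X_0^{\alpha} = w_0]) = [X_0^{\gamma} = w_0]$.
(ii) If $\beta < \alpha$, $i \in \{a, b\}$ and $w^{\gamma} \in W^{\gamma}$, then
\[ w_i^{\gamma}(\beta) = (w_i^{\gamma}{\upharpoonright}\alpha)(\beta) \]
and
\[ \pi_{\alpha,\gamma}^{-1}([X_i^{\alpha}(\beta) = (w_i^{\gamma}{\upharpoonright}\alpha)(\beta)]) = [X_i^{\gamma}(\beta) = w_i^{\gamma}(\beta)]. \]
(iii) If $i \in \{a, b\}$, $w^{\gamma} \in W^{\gamma}$, and if $\lambda$ is a limit ordinal such that $\lambda \le \alpha$, then
\[ \lambda\text{-}\mathrm{par}(w_i^{\gamma}) = \lambda\text{-}\mathrm{par}(w_i^{\gamma}{\upharpoonright}\alpha) \]
and
\[ \pi_{\alpha,\gamma}^{-1}([\lambda\text{-}\mathrm{par}(X_i^{\alpha}) = \lambda\text{-}\mathrm{par}(w_i^{\gamma}{\upharpoonright}\alpha)]) = [\lambda\text{-}\mathrm{par}(X_i^{\gamma}) = \lambda\text{-}\mathrm{par}(w_i^{\gamma})]. \]

Now, let $\kappa$ be a fixed regular cardinal.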
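THEOREM 2. For $i \in \{a, b\}$, there is a function $T_i : W^{\kappa} \to \Delta(W^{\kappa}, \mathrm{Pow}(W^{\kappa}))$
with the following properties:

(a) $T_i(u^{\kappa}) = T_i(w^{\kappa})$, for $u^{\kappa} \in P_i(w^{\kappa})$,
(b) $T_i(w^{\kappa})(P_i(w^{\kappa})) = 1$,
(c)
\[ T_i(w^{\kappa})([X_0^{\kappa} = w_0]) = \begin{cases} 1, & \text{if } w_i^{\kappa}(0) = 1,\\ \tfrac12, & \text{if } w_i^{\kappa}(0) = 0, \end{cases} \]
(d) for $\beta < \kappa$:
\[ T_i(w^{\kappa})([X_j^{\kappa}(\beta) = w_j^{\kappa}(\beta)]) = \begin{cases} 1, & \text{if } w_i^{\kappa}(\beta + 1) = 1,\\ \tfrac12, & \text{if } w_i^{\kappa}(\beta + 1) = 0, \end{cases} \]
(e) for $\lambda < \kappa$ such that $\lambda$ is a limit ordinal:
\[ T_i(w^{\kappa})([\lambda\text{-}\mathrm{par}(X_j^{\kappa}) = \lambda\text{-}\mathrm{par}(w_j^{\kappa})]) = \begin{cases} 1, & \text{if } w_i^{\kappa}(\lambda) = 1,\\ \tfrac12, & \text{if } w_i^{\kappa}(\lambda) = 0, \end{cases} \]
(f) for $0 \le \beta < \alpha < \kappa$ and $u^{\kappa}, w^{\kappa} \in W^{\kappa}$: if $E^{\beta} \subseteq W^{\beta}$ and $u^{\kappa}{\upharpoonright}\alpha = w^{\kappa}{\upharpoonright}\alpha$, then
\[ T_i(u^{\kappa})(\pi_{\beta,\kappa}^{-1}(E^{\beta})) = T_i(w^{\kappa})(\pi_{\beta,\kappa}^{-1}(E^{\beta})). \]

Theorem 2 will be proved in Section 6.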
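
To get a feel for properties (a) to (d), one can look at a finite analogue. The sketch below is not the construction of Section 6 (which works at a regular cardinal $\kappa$ and relies on extension theorems for finitely additive measures); it assumes instead a finite truncation level `N` and takes, as a stand-in for $T_i(w)$, the uniform distribution on the cell $P_i(w)$. All names are hypothetical.

```python
from fractions import Fraction
from itertools import product

N = 3  # assumption: a finite truncation level standing in for kappa
RECORDS = list(product((0, 1), repeat=N))
STATES = [(w0, ra, rb) for w0 in "ht" for ra in RECORDS for rb in RECORDS]


def cell(i, w):
    """P_i(w) at the finite level N; i = 0 is player a, i = 1 is player b."""
    j = 1 - i
    wi, wj = w[1 + i], w[1 + j]
    return [v for v in STATES
            if v[1 + i] == wi
            and (wi[0] == 0 or v[0] == w[0])
            and all(wi[b + 1] == 0 or v[1 + j][b] == wj[b] for b in range(N - 1))]


def belief(i, w, event):
    """Probability that the uniform distribution on P_i(w) assigns to `event`.

    `event` is a predicate on states; the uniform choice is an illustrative
    assumption, not the measure constructed in the paper.
    """
    c = cell(i, w)
    return Fraction(sum(1 for v in c if event(v)), len(c))


def expected(flag):
    """The value demanded by properties (c) and (d): 1 if the flag is 1, else 1/2."""
    return Fraction(1) if flag == 1 else Fraction(1, 2)
```

Because each cell is a product set in which every unconstrained coordinate is free, the uniform distribution gives probability 1/2 to each free coordinate. This is the finite counterpart of the value 1/2 in (c) and (d).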


REMARK 11. For $w^{\kappa} \in W^{\kappa}$, define
\[ \theta(w^{\kappa}) := w_0. \]

From the first two points of Theorem 2 and the fact that $T_i(w^{\kappa})$ is a finitely additive
probability measure defined on $(W^{\kappa}, \mathrm{Pow}(W^{\kappa}))$, it follows that

\[ \langle W^{\kappa}, (T_i)_{i\in\{a,b\}}, \theta\rangle \]
is a $\kappa$-type space on $S = \{h, t\}$ for player set $\{a, b\}$.

Next, by induction on the formation of $\kappa$-expressions, we define the depth of a
$\kappa$-expression. The depth of a $\kappa$-expression is an ordinal number that measures how
complex that expression is with respect to the players' belief operators.


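DEFINITION 20. (i) If $E \in \Sigma_S$, then $\mathrm{dp}(E) := 0$,
(ii) if $0 \le p \le 1$, $i \in I$ and if $\varphi$ is a $\kappa$-expression, then

\[ \mathrm{dp}(B_i^{p}(\varphi)) := \mathrm{dp}(\varphi) + 1, \]

(iii) if $\varphi$ is a $\kappa$-expression, then $\mathrm{dp}(\neg\varphi) := \mathrm{dp}(\varphi)$,
(iv) if $\Psi$ is a set of $\kappa$-expressions such that $|\Psi| < \kappa$, then

\[ \mathrm{dp}\Bigl(\bigwedge_{\varphi\in\Psi}\varphi\Bigr) := \sup\{\mathrm{dp}(\varphi) \mid \varphi \in \Psi\}. \]

It is easy to see that, since $\kappa$ is regular, the depth of a $\kappa$-expression is always
strictly smaller than $\kappa$.
The following Lemmas 7-9 will be proved in the Appendix.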
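
The depth recursion of Definition 20 is easy to mirror for finite expressions. The sketch below is illustrative only and makes several assumptions: conjunctions are finite tuples (so the supremum is a maximum), disjunction is treated as the abbreviation $\neg\bigwedge\neg$, and the class and function names are hypothetical. The helper `phi1` follows the pattern $B_i^{1}(\cdot) \vee B_i^{1}(\neg\,\cdot)$ suggested by Lemma 8 at successor levels.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Event:
    """An event E about the state of nature (depth 0)."""
    states: frozenset


@dataclass(frozen=True)
class Not:
    sub: "Expr"


@dataclass(frozen=True)
class And:
    """A finite conjunction; finiteness is an assumption of this sketch."""
    subs: Tuple["Expr", ...]


@dataclass(frozen=True)
class Bel:
    """B_i^p(sub): player i assigns probability at least p to sub."""
    player: str
    p: float
    sub: "Expr"


Expr = Union[Event, Not, And, Bel]


def dp(e: Expr) -> int:
    """Depth as in Definition 20, restricted to finite expressions."""
    if isinstance(e, Event):
        return 0
    if isinstance(e, Not):
        return dp(e.sub)
    if isinstance(e, And):
        return max(dp(s) for s in e.subs)
    if isinstance(e, Bel):
        return dp(e.sub) + 1
    raise TypeError(f"not an expression: {e!r}")


def Or(*subs: Expr) -> Expr:
    """Disjunction as an abbreviation for not-and-not (an assumption of this sketch)."""
    return Not(And(tuple(Not(s) for s in subs)))


def phi1(i: str, beta: int) -> Expr:
    """A finite-level expression of depth beta + 1: player i is certain about level beta - 1 of j."""
    j = "b" if i == "a" else "a"
    if beta == 0:
        return Or(Bel(i, 1.0, Event(frozenset("h"))), Bel(i, 1.0, Event(frozenset("t"))))
    prev = phi1(j, beta - 1)
    return Or(Bel(i, 1.0, prev), Bel(i, 1.0, Not(prev)))
```

Negation and conjunction leave the depth unchanged, so only nested belief operators contribute. This is why the depth of `phi1(i, beta)` is `beta + 1`.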


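LEMMA 7. Let $\alpha < \kappa$, $w^{\kappa}, u^{\kappa} \in W^{\kappa}$ and $w^{\kappa}{\upharpoonright}\alpha = u^{\kappa}{\upharpoonright}\alpha$.
Then, for all $\kappa$-expressions $\varphi$ such that $\mathrm{dp}(\varphi) \le \alpha$:

\[ w^{\kappa} \in \varphi^{W^{\kappa}} \ \text{ iff } \ u^{\kappa} \in \varphi^{W^{\kappa}}. \]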




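LEMMA 8. In the $\kappa$-type space $\langle W^{\kappa}, (T_i)_{i\in\{a,b\}}, \theta\rangle$, we have:
\[ [X_i^{\kappa}(0) = 1] = B_i^{1}([X_0^{\kappa} = h]) \cup B_i^{1}([X_0^{\kappa} = t]), \]
\[ [X_i^{\kappa}(\beta + 1) = 1] = B_i^{1}([X_j^{\kappa}(\beta) = 1]) \cup B_i^{1}([X_j^{\kappa}(\beta) = 0]) \]
for all ordinals $\beta < \kappa$,
\[ [X_i^{\kappa}(\lambda) = 1] = B_i^{1}([\lambda\text{-}\mathrm{par}(X_j^{\kappa}) = \text{even}]) \cup B_i^{1}([\lambda\text{-}\mathrm{par}(X_j^{\kappa}) = \text{odd}]) \]
for all limit ordinals $\lambda < \kappa$.

LEMMA 9. In the $*$-type space $\langle W^{\kappa}, (T_i)_{i\in\{a,b\}}, \theta\rangle$, we have:
(i)
\[ \{h\}^{W^{\kappa}} = [X_0^{\kappa} = h], \qquad \{t\}^{W^{\kappa}} = [X_0^{\kappa} = t], \qquad \mathrm{dp}(\{h\}) = \mathrm{dp}(\{t\}) = 0. \]

(ii) For every $i \in \{a, b\}$ and $\beta$ such that $0 \le \beta < \kappa$, there are $\kappa$-expressions
$\varphi_i^{0}(\beta)$ and $\varphi_i^{1}(\beta)$ with

\[ \mathrm{dp}(\varphi_i^{0}(\beta)) = \mathrm{dp}(\varphi_i^{1}(\beta)) = \beta + 1 \]
such that
\[ (\varphi_i^{1}(\beta))^{W^{\kappa}} = [X_i^{\kappa}(\beta) = 1] \]
and
\[ (\varphi_i^{0}(\beta))^{W^{\kappa}} = [X_i^{\kappa}(\beta) = 0]. \]

(iii) For every $i \in \{a, b\}$ and limit ordinal $\lambda < \kappa$, there are $\kappa$-expressions
$\varphi_i^{\mathrm{even}}(\lambda)$ and $\varphi_i^{\mathrm{odd}}(\lambda)$ with

\[ \mathrm{dp}(\varphi_i^{\mathrm{even}}(\lambda)) = \mathrm{dp}(\varphi_i^{\mathrm{odd}}(\lambda)) = \lambda \]
such that
\[ (\varphi_i^{\mathrm{even}}(\lambda))^{W^{\kappa}} = [\lambda\text{-}\mathrm{par}(X_i^{\kappa}) = \text{even}] \]
and
\[ (\varphi_i^{\mathrm{odd}}(\lambda))^{W^{\kappa}} = [\lambda\text{-}\mathrm{par}(X_i^{\kappa}) = \text{odd}]. \]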


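THEOREM 3. For every ordinal $\alpha < \kappa$ there are $u^{\kappa}, w^{\kappa} \in W^{\kappa}$ such that:
1. For all $\kappa$-expressions $\varphi$ with $\mathrm{dp}(\varphi) \le \alpha$:

\[ u^{\kappa} \in \varphi^{W^{\kappa}} \ \text{ iff } \ w^{\kappa} \in \varphi^{W^{\kappa}}. \]

2. There is a $\kappa$-expression $\psi$ with $\mathrm{dp}(\psi) = \alpha + 1$ such that

\[ u^{\kappa} \in \psi^{W^{\kappa}} \quad\text{and}\quad w^{\kappa} \notin \psi^{W^{\kappa}}. \]


PROOF. Let $\alpha < \kappa$ and $i \in \{a, b\}$. By the definition of $W^{\kappa}$, there are $u^{\kappa}$,
$w^{\kappa} \in W^{\kappa}$ such that $u^{\kappa}{\upharpoonright}\alpha = w^{\kappa}{\upharpoonright}\alpha$ and $1 = u_i^{\kappa}(\alpha) \neq w_i^{\kappa}(\alpha) = 0$. The first point
follows now by Lemma 7. By Lemma 9, it follows that $u^{\kappa} \in (\varphi_i^{1}(\alpha))^{W^{\kappa}}$, $w^{\kappa} \in (\neg\varphi_i^{1}(\alpha))^{W^{\kappa}}$,
and $\mathrm{dp}(\varphi_i^{1}(\alpha)) = \alpha + 1$. $\Box$


Note that Lemma 7 and the proof of Theorem 3 show that $u_0$ and the levels up
to and including level $\alpha$ of $u_a^{\kappa}$ and of $u_b^{\kappa}$ determine which of the $\kappa$-expressions of
depth $\alpha + 1$ belong to the $\kappa$-description of $u^{\kappa}$.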


THEOREM 4. Let $|I| \ge 2$ and $|S| \ge 2$. Then, there is no weak-universal
$\infty$-type space on $S$ for player set $I$ and there is no weak-universal $*$-type space
on $S$ for player set $I$.


PROOF. Assume there is a weak-universal $\infty$-type space (a weak-universal
$*$-type space, resp.)

\[ \underline{U} = \langle U, \Sigma_U, (T_i^{U})_{i\in I}, \theta^{U}\rangle \]

on $S$ for player set $I$. Then, the underlying set $U$ has a cardinality $|U|$. There is a
regular cardinal number $\kappa > |U|$.

\[ \underline{W}^{\kappa} = \langle W^{\kappa}, (T_i)_{i\in\{a,b\}}, \theta\rangle \]

is a $*$-type space on $\{h, t\}$ (and therefore an $\infty$-type space). Since $|\{h, t\}| = 2$, we
can assume without loss of generality that $\{h, t\} \subseteq S$, and since $\Sigma_S = \mathrm{Pow}(S)$, that
$\Sigma_S \supseteq \mathrm{Pow}(\{h, t\})$. Also, since $|I| \ge 2$, we can assume without loss of generality
that $\{a, b\} \subseteq I$. For $i \in I \setminus \{a, b\}$ define $T_i(w^{\kappa}) := \delta_{w^{\kappa}}$, and view $\theta$ as a function
from $W^{\kappa}$ to $S$. Then

\[ \underline{W}_I^{\kappa} := \langle W^{\kappa}, (T_i)_{i\in I}, \theta\rangle \]

is a $*$-type space on $S$ (with player set $I$). Since every $*$-type space is an $\infty$-type
space, $\underline{W}_I^{\kappa}$ is also an $\infty$-type space. According to the assumption, there is a type
morphism $f : W^{\kappa} \to U$. Since both spaces are in particular $\kappa$-type spaces, this
morphism preserves $\kappa$-descriptions. If $\varphi$ is a $\kappa$-expression in the "language"
corresponding to the set of states of nature $\{h, t\}$ and the player set $\{a, b\}$, then $\varphi$ is
also a $\kappa$-expression in the "language" corresponding to the set of states of nature
$S$ and the player set $I$, and it is easy to check that for $w^{\kappa} \in W^{\kappa}$ we have
$w^{\kappa} \in \varphi^{\underline{W}_I^{\kappa}}$ iff $w^{\kappa} \in \varphi^{\underline{W}^{\kappa}}$. So, by Lemma 9, it is still the case that two different
states of $\underline{W}_I^{\kappa}$ have different $\kappa$-descriptions. Hence, since by Proposition 2, $f$
preserves $\kappa$-descriptions, $f$ is one-to-one. It follows that $|U| \ge |W^{\kappa}| = \kappa$, which is a
contradiction to $|U| < \kappa$. $\Box$




COROLLARY 1. Let $|I| \ge 2$ and $|S| \ge 2$. Then there is no universal $\infty$-type
space on $S$ for player set $I$ and there is no universal $*$-type space on $S$ for player
set $I$.


6. Construction of the $T_i$'s. This section is devoted to the proof of Theorem 2,
that is, the construction of the $T_i$'s mentioned there. Lemmas 10 and 12-17
needed for this construction are proved at the end of this section.

The construction will not be carried out at once. By a transfinite induction on
$1 \le \alpha \le \kappa$, we endow $W^{\alpha}$ with fields $\mathcal{F}(i, w^{\alpha})$ and finitely additive probability
measures $T_i^{\alpha}(w^{\alpha})$ on $(W^{\alpha}, \mathcal{F}(i, w^{\alpha}))$ such that the following Induction hypothesis
is satisfied:


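INDUCTION HYPOTHESIS (for $\alpha$).
1. $\mathcal{F}(i, w^{\alpha}) := [\bigcup_{\beta<\alpha} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ P_i(w^{\alpha})]$.
2. For every ordinal $\beta$ with $1 \le \beta < \alpha$ and every $E^{\beta} \in \mathcal{F}(i, w^{\alpha}{\upharpoonright}\beta)$:
\[ T_i^{\alpha}(w^{\alpha})(\pi_{\beta,\alpha}^{-1}(E^{\beta})) = T_i^{\beta}(w^{\alpha}{\upharpoonright}\beta)(E^{\beta}), \]
that is, $\mathrm{marg}_{(W^{\beta}, \mathcal{F}(i, w^{\alpha}{\upharpoonright}\beta))}\, T_i^{\alpha}(w^{\alpha}) = T_i^{\beta}(w^{\alpha}{\upharpoonright}\beta)$.
3. $T_i^{\alpha}(w^{\alpha}) = T_i^{\alpha}(u^{\alpha})$ for $u^{\alpha} \in P_i(w^{\alpha})$.
4. $T_i^{\alpha}(w^{\alpha})(P_i(w^{\alpha})) = 1$.
5.
\[ T_i^{\alpha}(w^{\alpha})([X_0^{\alpha} = w_0]) = \begin{cases} 1, & \text{if } w_i^{\alpha}(0) = 1,\\ \tfrac12, & \text{if } w_i^{\alpha}(0) = 0. \end{cases} \]
6. For $\beta$ such that $\beta + 1 < \alpha$:
\[ T_i^{\alpha}(w^{\alpha})([X_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)]) = \begin{cases} 1, & \text{if } w_i^{\alpha}(\beta + 1) = 1,\\ \tfrac12, & \text{if } w_i^{\alpha}(\beta + 1) = 0. \end{cases} \]
7. For $\lambda < \alpha$ such that $\lambda$ is a limit ordinal:
\[ T_i^{\alpha}(w^{\alpha})([\lambda\text{-}\mathrm{par}(X_j^{\alpha}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha})]) = \begin{cases} 1, & \text{if } w_i^{\alpha}(\lambda) = 1,\\ \tfrac12, & \text{if } w_i^{\alpha}(\lambda) = 0. \end{cases} \]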

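Let $1 \le \gamma \le \alpha$. Then $\bigcup_{\beta<\gamma} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta}))$ is a field on $W^{\alpha}$. Hence, by definition,
$\mathcal{F}(i, w^{\alpha})$ is also a field on $W^{\alpha}$. Since inverse images commute with complements,
arbitrary unions and intersections, we have:


REMARK 12. Let $1 \le \gamma \le \alpha$. Then,
\[ \pi_{\gamma,\alpha}^{-1}(\mathcal{F}(i, w^{\alpha}{\upharpoonright}\gamma)) = \Bigl[\bigcup_{\beta<\gamma} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))\Bigr] \]
is a field on $W^{\alpha}$. In particular,
\[ \pi_{\beta+1,\alpha}^{-1}(\mathcal{F}(i, w^{\alpha}{\upharpoonright}\beta + 1)) = [\pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))] \]
is a field on $W^{\alpha}$, for $\beta + 1 < \alpha$. Note also that $\pi_{\gamma,\alpha}^{-1}(\mathcal{F}(i, w^{\alpha}{\upharpoonright}\gamma)) \subseteq \mathcal{F}(i, w^{\alpha})$, for
$1 \le \gamma \le \alpha$.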


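REMARK 13. Let $1 \le \beta \le \alpha \le \kappa$ and let $u^{\alpha}, w^{\alpha} \in W^{\alpha}$ such that $u^{\alpha} \in P_i(w^{\alpha})$
[and hence $P_i(u^{\alpha}) = P_i(w^{\alpha})$]. Then:
(a) We have
\[ \mathcal{F}(i, w^{\alpha}) = \mathcal{F}(i, u^{\alpha}), \qquad \mathcal{F}(i, w^{\alpha}{\upharpoonright}\beta) = \mathcal{F}(i, u^{\alpha}{\upharpoonright}\beta). \]
(b) If
\[ T_i^{\alpha}(w^{\alpha}) = T_i^{\alpha}(u^{\alpha}), \qquad T_i^{\beta}(u^{\alpha}{\upharpoonright}\beta) = T_i^{\beta}(w^{\alpha}{\upharpoonright}\beta), \]
\[ \mathrm{marg}_{(W^{\beta}, \mathcal{F}(i, w^{\alpha}{\upharpoonright}\beta))}\, T_i^{\alpha}(w^{\alpha}) = T_i^{\beta}(w^{\alpha}{\upharpoonright}\beta), \]
then we obviously also have
\[ \mathrm{marg}_{(W^{\beta}, \mathcal{F}(i, u^{\alpha}{\upharpoonright}\beta))}\, T_i^{\alpha}(u^{\alpha}) = T_i^{\beta}(u^{\alpha}{\upharpoonright}\beta). \]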


Before we begin the construction of the $T_i$'s, we have to provide some lemmas
that are needed to carry out this construction. These lemmas will guarantee that the
induction can be done maintaining the conditions 1-7 of the Induction hypothesis.
They will be proved at the end of this section.


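LEMMA 10. Let $\gamma$ be an ordinal $> 0$, $\alpha = \gamma + 1$, $w^{\alpha} \in W^{\alpha}$, and

\[ E \in \Bigl[\Bigl[\bigcup_{\beta<\gamma} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))\Bigr],\ P_i(w^{\alpha})\Bigr]. \]
Then there are a $\beta < \gamma$ and $A_{\beta}, C_{\beta}, D_{\beta} \in \mathrm{Pow}(W^{\beta})$ such that
\[ E = (\pi_{\beta,\alpha}^{-1}(A_{\beta}) \cap P_i(w^{\alpha})) \cup (\pi_{\beta,\alpha}^{-1}(C_{\beta}) \cap \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma)) \cap (W^{\alpha} \setminus P_i(w^{\alpha}))) \cup (\pi_{\beta,\alpha}^{-1}(D_{\beta}) \cap (W^{\alpha} \setminus \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma)))). \]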

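For further reference, we cite here (in a slightly changed formulation and in our
notation) Lemma 3.2 of [9]:


LEMMA 11. Let $v^{\alpha}, w^{\alpha} \in W^{\alpha}$, where $v^{\alpha}{\upharpoonright}\gamma + 1 \in P_i(w^{\alpha}{\upharpoonright}\gamma + 1)$, for some
$\gamma < \alpha$. Then there is a $u^{\alpha} \in P_i(w^{\alpha})$ such that $u^{\alpha}{\upharpoonright}\gamma = v^{\alpha}{\upharpoonright}\gamma$.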


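LEMMA 12. Let $\lambda$ be a limit ordinal, $\alpha = \lambda + 1$, $w^{\alpha} \in W^{\alpha}$, $w_i^{\alpha}(\lambda) = 0$ and
$E = \pi_{\beta,\alpha}^{-1}(E_{\beta})$, where $E_{\beta} \subseteq W^{\beta}$ for a $\beta < \lambda$. Then:

(a) If $v^{\alpha} \in E \cap P_i(w^{\alpha})$, then there is a $u^{\alpha} \in E \cap P_i(w^{\alpha})$ such that
$\lambda\text{-}\mathrm{par}(u_j^{\alpha}) \neq \lambda\text{-}\mathrm{par}(v_j^{\alpha})$.

(b) If $v^{\alpha} \in E \cap \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda)) \cap (W^{\alpha} \setminus P_i(w^{\alpha}))$, then there is a
$u^{\alpha} \in E \cap \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda)) \cap (W^{\alpha} \setminus P_i(w^{\alpha}))$ such that $\lambda\text{-}\mathrm{par}(u_j^{\alpha}) \neq \lambda\text{-}\mathrm{par}(v_j^{\alpha})$.

(c) If $v^{\alpha} \in E \cap (W^{\alpha} \setminus \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda)))$, then there is a
$u^{\alpha} \in E \cap (W^{\alpha} \setminus \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda)))$ such that $\lambda\text{-}\mathrm{par}(u_j^{\alpha}) \neq \lambda\text{-}\mathrm{par}(v_j^{\alpha})$.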


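LEMMA 13. Let $\beta$ be an ordinal, $\alpha = (\beta + 1) + 1$, $w^{\alpha} \in W^{\alpha}$, $w_i^{\alpha}(\beta + 1) = 0$
and $E = \pi_{\beta,\alpha}^{-1}(E_{\beta})$, such that $E_{\beta} \subseteq W^{\beta}$. Then:

(a) If $v^{\alpha} \in E \cap P_i(w^{\alpha})$, then there is a $u^{\alpha} \in E \cap P_i(w^{\alpha})$ such that
$u_j^{\alpha}(\beta) \neq v_j^{\alpha}(\beta)$.

(b) If $v^{\alpha} \in E \cap \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1)) \cap (W^{\alpha} \setminus P_i(w^{\alpha}))$, then there is a
$u^{\alpha} \in E \cap \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1)) \cap (W^{\alpha} \setminus P_i(w^{\alpha}))$ such that $u_j^{\alpha}(\beta) \neq v_j^{\alpha}(\beta)$.

(c) If $v^{\alpha} \in E \cap (W^{\alpha} \setminus \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1)))$, then there is a
$u^{\alpha} \in E \cap (W^{\alpha} \setminus \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1)))$ such that $u_j^{\alpha}(\beta) \neq v_j^{\alpha}(\beta)$.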


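LEMMA 14. Let $\lambda$ be a limit ordinal, $\alpha = \lambda + 1$, $w^{\alpha} \in W^{\alpha}$, $w_i^{\alpha}(\lambda) = 0$ and

\[ E \in \Bigl[\Bigl[\bigcup_{\beta<\lambda} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda))\Bigr],\ P_i(w^{\alpha})\Bigr]. \]
Then:

(a) If $v^{\alpha} \in E$, then there is a $u^{\alpha} \in E$ such that $\lambda\text{-}\mathrm{par}(u_j^{\alpha}) \neq \lambda\text{-}\mathrm{par}(v_j^{\alpha})$.

(b) If $v^{\alpha} \in W^{\alpha} \setminus E$, then there is a $u^{\alpha} \in W^{\alpha} \setminus E$ such that
$\lambda\text{-}\mathrm{par}(u_j^{\alpha}) \neq \lambda\text{-}\mathrm{par}(v_j^{\alpha})$.

(c) If $E \supseteq [\lambda\text{-}\mathrm{par}(X_j^{\alpha}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha})]$, then $E = W^{\alpha}$.

(d) If $E \subseteq [\lambda\text{-}\mathrm{par}(X_j^{\alpha}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha})]$, then $E = \emptyset$.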


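LEMMA 15. Let $\beta$ be an ordinal, $\alpha = (\beta + 1) + 1$, $w^{\alpha} \in W^{\alpha}$, $w_i^{\alpha}(\beta + 1) = 0$
and

\[ E \in [[\pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))],\ P_i(w^{\alpha})]. \]
Then:

(a) If $v^{\alpha} \in E$, then there is a $u^{\alpha} \in E$ such that $u_j^{\alpha}(\beta) \neq v_j^{\alpha}(\beta)$.

(b) If $v^{\alpha} \in W^{\alpha} \setminus E$, then there is a $u^{\alpha} \in W^{\alpha} \setminus E$ such that $u_j^{\alpha}(\beta) \neq v_j^{\alpha}(\beta)$.
(c) If $E \supseteq [X_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)]$, then $E = W^{\alpha}$.

(d) If $E \subseteq [X_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)]$, then $E = \emptyset$.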




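LEMMA 16. Let $\gamma$ be an ordinal $> 0$, $\alpha = \gamma + 1$, $w^{\alpha} \in W^{\alpha}$ and

\[ E \in \Bigl[\bigcup_{\beta<\gamma} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))\Bigr] \]
such that $E \supseteq P_i(w^{\alpha})$. Then

\[ E \supseteq \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma)). \]


LEMMA 17. Let $\lambda$ be a limit ordinal, $w^{\lambda} \in W^{\lambda}$, $\beta < \lambda$ and $E^{\beta} \subseteq W^{\beta}$ such
that $\pi_{\beta,\lambda}^{-1}(E^{\beta}) \supseteq P_i(w^{\lambda})$. Then $\pi_{\beta,\lambda}^{-1}(E^{\beta}) \supseteq \pi_{\beta+1,\lambda}^{-1}(P_i(w^{\lambda}{\upharpoonright}\beta + 1))$.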


PROOF OF THEOREM 2. We construct the $T_i$'s by transfinite induction on
$1 \le \alpha \le \kappa$, such that the conditions 1-7 of the Induction hypothesis at the beginning of
this section are satisfied:


Step $\alpha = 1$. We have $W^{0} = \{h, t\}$. Let $w^{1} \in W^{1}$. Define

\[ T_i^{<1}(w^{1})([X_0^{1} = w_0]) := \begin{cases} 1, & \text{if } w_i^{1}(0) = 1,\\ \tfrac12, & \text{if } w_i^{1}(0) = 0, \end{cases} \]
\[ T_i^{<1}(w^{1})([X_0^{1} \neq w_0]) := 1 - T_i^{<1}(w^{1})([X_0^{1} = w_0]), \]
\[ T_i^{<1}(w^{1})(\emptyset) := 0, \]
\[ T_i^{<1}(w^{1})(W^{1}) := 1. \]
It is clear that by this definition, $T_i^{<1}(w^{1})$ is a probability measure on
$(W^{1}, \pi_{0,1}^{-1}(\mathrm{Pow}(W^{0})))$.
Let $E^{0} \subseteq W^{0}$ such that $P_i(w^{1}) \subseteq \pi_{0,1}^{-1}(E^{0})$.
1. Case: $w_i^{1}(0) = 1$. Then $E^{0} = W^{0}$ or $E^{0} = [X_0^{0} = w_0]$, hence the outer measure
$T_i^{<1}(w^{1})^{*}(P_i(w^{1}))$ is equal to 1.

2. Case: $w_i^{1}(0) = 0$. Then $E^{0} = W^{0}$ and the outer measure $T_i^{<1}(w^{1})^{*}(P_i(w^{1}))$ is
equal to 1.

For $u^{1} \in P_i(w^{1})$, we have in both cases that $P_i(u^{1}) = P_i(w^{1})$ and $T_i^{<1}(u^{1}) = T_i^{<1}(w^{1})$.
For each $P_i(u^{1})$ such that $u^{1} \in W^{1}$, choose a representing element
$w^{1} \in P_i(u^{1}) = P_i(w^{1})$. By the Łoś-Marczewski theorem, we can extend $T_i^{<1}(w^{1})$ to a
finitely additive probability measure $T_i^{1}(w^{1})$ on the field

\[ \mathcal{F}(i, w^{1}) = [\pi_{0,1}^{-1}(\mathrm{Pow}(W^{0})),\ P_i(w^{1})] \]

such that $T_i^{1}(w^{1})(P_i(w^{1})) = 1$. Define $T_i^{1}(u^{1}) := T_i^{1}(w^{1})$, for all $u^{1} \in P_i(w^{1})$.
Note that $\mathcal{F}(i, u^{1}) = \mathcal{F}(i, w^{1})$, for $u^{1} \in P_i(w^{1})$. $T_i^{1}(u^{1})$ and $\mathcal{F}(i, u^{1})$ satisfy
the conditions 1-7 of the Induction hypothesis.




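Step $\alpha = (\beta + 1) + 1$, for $0 \le \beta < \kappa$. For each $P_i(u^{\alpha})$ such that $u^{\alpha} \in W^{\alpha}$,
choose a representing element $w^{\alpha} \in P_i(u^{\alpha}) = P_i(w^{\alpha})$.
Let $T_i^{<\alpha}(w^{\alpha})$ be the finitely additive probability measure defined on the field

\[ \pi_{\beta+1,\alpha}^{-1}(\mathcal{F}(i, w^{\alpha}{\upharpoonright}\beta + 1)) = [\pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))], \]

which is induced by $T_i^{\beta+1}(w^{\alpha}{\upharpoonright}\beta + 1)$ (as defined in Lemma 1). According
to Lemma 16 and the Induction hypothesis, we have for the outer measure of
$P_i(w^{\alpha})$: $T_i^{<\alpha}(w^{\alpha})^{*}(P_i(w^{\alpha})) = 1$. So, by the Łoś-Marczewski theorem, we can
extend $T_i^{<\alpha}(w^{\alpha})$ to a finitely additive probability measure $\tilde{T}_i^{\alpha}(w^{\alpha})$ defined on the
field

\[ \tilde{\mathcal{F}}(i, w^{\alpha}) = [[\pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))],\ P_i(w^{\alpha})] \]
such that $\tilde{T}_i^{\alpha}(w^{\alpha})(P_i(w^{\alpha})) = 1$.
1. Case: $w_i^{\alpha}(\beta + 1) = 1$. Then

\[ \pi_{\beta+1,\alpha}^{-1}([X_j^{\beta+1}(\beta) = (w_j^{\alpha}{\upharpoonright}\beta + 1)(\beta)]) = [X_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)] \supseteq P_i(w^{\alpha}). \]

By Lemma 3, extend $\tilde{T}_i^{\alpha}(w^{\alpha})$ to a finitely additive probability measure $T_i^{\alpha}(w^{\alpha})$
on the field

\[ [\pi_{\beta+1,\alpha}^{-1}(\mathrm{Pow}(W^{\beta+1})),\ P_i(w^{\alpha})] = \mathcal{F}(i, w^{\alpha}). \]
By the above, we have
\[ T_i^{\alpha}(w^{\alpha})([X_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)]) = 1. \]
Define now $T_i^{\alpha}(u^{\alpha}) := T_i^{\alpha}(w^{\alpha})$, for all $u^{\alpha} \in P_i(w^{\alpha})$. Note that, for
$u^{\alpha} \in P_i(w^{\alpha})$, we have $\tilde{\mathcal{F}}(i, u^{\alpha}) = \tilde{\mathcal{F}}(i, w^{\alpha})$, $\mathcal{F}(i, u^{\alpha}) = \mathcal{F}(i, w^{\alpha})$ and
$u_i^{\alpha}(\beta + 1) = w_i^{\alpha}(\beta + 1) = 1$ and hence, $u_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)$. It is now easy to check that
$T_i^{\alpha}(u^{\alpha})$ and $\mathcal{F}(i, u^{\alpha})$ satisfy the conditions 1-7 of the Induction hypothesis.
2. Case: $w_i^{\alpha}(\beta + 1) = 0$. By Lemma 15 and the Induction hypothesis, we have for
the outer measure of $[X_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)]$:
\[ \tilde{T}_i^{\alpha}(w^{\alpha})^{*}([X_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)]) = 1, \]
and for the inner measure

\[ \tilde{T}_i^{\alpha}(w^{\alpha})_{*}([X_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)]) = 0. \]

By the Łoś-Marczewski theorem, we can extend $\tilde{T}_i^{\alpha}(w^{\alpha})$ to a finitely additive
probability measure $\hat{T}_i^{\alpha}(w^{\alpha})$ on the field

\[ [\tilde{\mathcal{F}}(i, w^{\alpha}),\ [X_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)]] \]
such that
\[ \hat{T}_i^{\alpha}(w^{\alpha})([X_j^{\alpha}(\beta) = w_j^{\alpha}(\beta)]) = \tfrac12. \]

Finally, by Lemma 3, extend $\hat{T}_i^{\alpha}(w^{\alpha})$ to a finitely additive probability measure
$T_i^{\alpha}(w^{\alpha})$ on $\mathcal{F}(i, w^{\alpha})$. Define now $T_i^{\alpha}(u^{\alpha}) := T_i^{\alpha}(w^{\alpha})$, for all $u^{\alpha} \in P_i(w^{\alpha})$. It
is easy to check that $T_i^{\alpha}(u^{\alpha})$ and $\mathcal{F}(i, u^{\alpha})$ satisfy the conditions 1-7 of the
Induction hypothesis.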


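Step $\alpha = \lambda$, $\lambda$ limit ordinal. For each $P_i(u^{\alpha})$ such that $u^{\alpha} \in W^{\alpha}$, choose a
representing element $w^{\alpha} \in P_i(u^{\alpha}) = P_i(w^{\alpha})$.
Let $T_i^{<\alpha}(w^{\alpha})$ be the finitely additive probability measure defined on the field

\[ \bigcup_{\beta<\alpha} \pi_{\beta,\alpha}^{-1}(\mathcal{F}(i, w^{\alpha}{\upharpoonright}\beta)) = \bigcup_{\beta<\alpha} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})) \]
which is induced by $(T_i^{\beta}(w^{\alpha}{\upharpoonright}\beta))_{1\le\beta<\alpha}$ (as defined in Lemma 1).
Let $\beta < \alpha$ and $E^{\beta} \subseteq W^{\beta}$ such that $\pi_{\beta,\alpha}^{-1}(E^{\beta}) \supseteq P_i(w^{\alpha})$. By Lemma 17, we have

\[ \pi_{\beta,\alpha}^{-1}(E^{\beta}) \supseteq \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1)). \]
(Note that $\beta < \alpha$ implies $\beta + 1 < \alpha$.) Since, by the definition of $T_i^{<\alpha}(w^{\alpha})$,

\[ T_i^{<\alpha}(w^{\alpha})(\pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))) = T_i^{\beta+1}(w^{\alpha}{\upharpoonright}\beta + 1)(P_i(w^{\alpha}{\upharpoonright}\beta + 1)) = 1, \]

the outer measure $T_i^{<\alpha}(w^{\alpha})^{*}(P_i(w^{\alpha}))$ is equal to 1. By the Łoś-Marczewski theorem,
we can extend $T_i^{<\alpha}(w^{\alpha})$ to a finitely additive probability measure $T_i^{\alpha}(w^{\alpha})$
on the field

\[ \mathcal{F}(i, w^{\alpha}) = \Bigl[\bigcup_{\beta<\alpha} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ P_i(w^{\alpha})\Bigr] \]

such that $T_i^{\alpha}(w^{\alpha})(P_i(w^{\alpha})) = 1$. For $u^{\alpha} \in P_i(w^{\alpha})$ define $T_i^{\alpha}(u^{\alpha}) := T_i^{\alpha}(w^{\alpha})$. It is
easy to check that $T_i^{\alpha}(u^{\alpha})$ and $\mathcal{F}(i, u^{\alpha})$ satisfy the conditions 1-7 of the Induction
hypothesis.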


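Step $\alpha = \lambda + 1$, $\lambda$ limit ordinal. For each $P_i(u^{\alpha})$ such that $u^{\alpha} \in W^{\alpha}$, choose a
representing element $w^{\alpha} \in P_i(u^{\alpha}) = P_i(w^{\alpha})$.
Let $T_i^{<\alpha}(w^{\alpha})$ be the finitely additive probability measure defined on the field

\[ \pi_{\lambda,\alpha}^{-1}(\mathcal{F}(i, w^{\alpha}{\upharpoonright}\lambda)) = \Bigl[\bigcup_{\beta<\lambda} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda))\Bigr] \]

which is induced by $T_i^{\lambda}(w^{\alpha}{\upharpoonright}\lambda)$ (as defined in Lemma 1). According to Lemma 16
and the Induction hypothesis, we have for the outer measure of $P_i(w^{\alpha})$:
$T_i^{<\alpha}(w^{\alpha})^{*}(P_i(w^{\alpha})) = 1$. So, by the Łoś-Marczewski theorem, we can extend
$T_i^{<\alpha}(w^{\alpha})$ to a finitely additive probability measure $\tilde{T}_i^{\alpha}(w^{\alpha})$ defined on the field

\[ \tilde{\mathcal{F}}(i, w^{\alpha}) = \Bigl[\Bigl[\bigcup_{\beta<\lambda} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda))\Bigr],\ P_i(w^{\alpha})\Bigr] \]
such that $\tilde{T}_i^{\alpha}(w^{\alpha})(P_i(w^{\alpha})) = 1$.

1. Case: $w_i^{\alpha}(\lambda) = 1$. Then
\[ \pi_{\lambda,\alpha}^{-1}([\lambda\text{-}\mathrm{par}(X_j^{\lambda}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha}{\upharpoonright}\lambda)]) = [\lambda\text{-}\mathrm{par}(X_j^{\alpha}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha})] \supseteq P_i(w^{\alpha}). \]
By Lemma 3, extend $\tilde{T}_i^{\alpha}(w^{\alpha})$ to a finitely additive probability measure $T_i^{\alpha}(w^{\alpha})$
on the field
\[ [\pi_{\lambda,\alpha}^{-1}(\mathrm{Pow}(W^{\lambda})),\ P_i(w^{\alpha})] = \mathcal{F}(i, w^{\alpha}). \]
By the above, we have
\[ T_i^{\alpha}(w^{\alpha})([\lambda\text{-}\mathrm{par}(X_j^{\alpha}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha})]) = 1. \]
Define now $T_i^{\alpha}(u^{\alpha}) := T_i^{\alpha}(w^{\alpha})$, for all $u^{\alpha} \in P_i(w^{\alpha})$. [Note that $\mathcal{F}(i, u^{\alpha}) = \mathcal{F}(i, w^{\alpha})$,
$u_i^{\alpha}(\lambda) = w_i^{\alpha}(\lambda) = 1$ and hence, $\lambda\text{-}\mathrm{par}(w_j^{\alpha}) = \lambda\text{-}\mathrm{par}(u_j^{\alpha})$.] It is
now easy to check that $T_i^{\alpha}(u^{\alpha})$ and $\mathcal{F}(i, u^{\alpha})$ satisfy the conditions 1-7 of the
Induction hypothesis.
2. Case: $w_i^{\alpha}(\lambda) = 0$. By Lemma 14, we have for the outer measure of
$[\lambda\text{-}\mathrm{par}(X_j^{\alpha}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha})]$:
\[ \tilde{T}_i^{\alpha}(w^{\alpha})^{*}([\lambda\text{-}\mathrm{par}(X_j^{\alpha}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha})]) = 1, \]
and for the inner measure
\[ \tilde{T}_i^{\alpha}(w^{\alpha})_{*}([\lambda\text{-}\mathrm{par}(X_j^{\alpha}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha})]) = 0. \]

By the Łoś-Marczewski theorem, we can extend $\tilde{T}_i^{\alpha}(w^{\alpha})$ to a finitely additive
probability measure $\hat{T}_i^{\alpha}(w^{\alpha})$ on the field

\[ [\tilde{\mathcal{F}}(i, w^{\alpha}),\ [\lambda\text{-}\mathrm{par}(X_j^{\alpha}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha})]] \]
such that
\[ \hat{T}_i^{\alpha}(w^{\alpha})([\lambda\text{-}\mathrm{par}(X_j^{\alpha}) = \lambda\text{-}\mathrm{par}(w_j^{\alpha})]) = \tfrac12. \]

Finally, by Lemma 3, extend $\hat{T}_i^{\alpha}(w^{\alpha})$ to a finitely additive probability measure
$T_i^{\alpha}(w^{\alpha})$ on $\mathcal{F}(i, w^{\alpha})$. Define now $T_i^{\alpha}(u^{\alpha}) := T_i^{\alpha}(w^{\alpha})$, for all $u^{\alpha} \in P_i(w^{\alpha})$. It
is easy to check that $T_i^{\alpha}(u^{\alpha})$ and $\mathcal{F}(i, u^{\alpha})$ satisfy the conditions 1-7 of the
Induction hypothesis.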


Remaining step. To finish the proof of Theorem 2, we have to extend $T_i^{\kappa}(u^{\kappa})$
to a finitely additive probability measure $T_i(u^{\kappa})$ defined on the field $\mathrm{Pow}(W^{\kappa})$
such that $T_i(u^{\kappa}) = T_i(w^{\kappa})$, for $u^{\kappa} \in P_i(w^{\kappa})$.

By the inductive construction, $T_i^{\kappa}(u^{\kappa})$ is defined on

\[ \Bigl[\bigcup_{\beta<\kappa} \pi_{\beta,\kappa}^{-1}(\mathrm{Pow}(W^{\beta})),\ P_i(u^{\kappa})\Bigr] \]

such that the conditions 1-7 of the Induction hypothesis are satisfied for $\alpha = \kappa$.
For each $P_i(u^{\kappa})$ such that $u^{\kappa} \in W^{\kappa}$, choose a representing element
$w^{\kappa} \in P_i(u^{\kappa}) = P_i(w^{\kappa})$.
By Lemma 3, extend $T_i^{\kappa}(w^{\kappa})$ to a finitely additive probability measure $T_i(w^{\kappa})$
on the field $\mathrm{Pow}(W^{\kappa})$ and define

\[ T_i(u^{\kappa}) := T_i(w^{\kappa}), \]

for $u^{\kappa} \in P_i(w^{\kappa})$. By construction and the Induction hypothesis for $\alpha = \kappa$, the function
$T_i : W^{\kappa} \to \Delta(W^{\kappa}, \mathrm{Pow}(W^{\kappa}))$ has all the desired properties, and hence Theorem 2
is proved. $\Box$


6.1. Proofs of Lemmas 10 and 12-17. 


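PROOF OF LEMMA 10. By the definition of

\[ \Bigl[\Bigl[\bigcup_{\beta<\gamma} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))\Bigr],\ P_i(w^{\alpha})\Bigr], \]

$E$ has the form

\[ E = \bigl(\bigl((\pi_{\beta,\alpha}^{-1}(A_{\beta}) \cap \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))) \cup (\pi_{\eta,\alpha}^{-1}(B_{\eta}) \cap (W^{\alpha} \setminus \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))))\bigr) \cap P_i(w^{\alpha})\bigr) \]
\[ \cup \bigl(\bigl((\pi_{\xi,\alpha}^{-1}(C_{\xi}) \cap \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))) \cup (\pi_{\zeta,\alpha}^{-1}(D_{\zeta}) \cap (W^{\alpha} \setminus \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))))\bigr) \cap (W^{\alpha} \setminus P_i(w^{\alpha}))\bigr), \]

where $\beta, \eta, \xi, \zeta < \gamma$ and $A_{\beta} \subseteq W^{\beta}$, $B_{\eta} \subseteq W^{\eta}$, $C_{\xi} \subseteq W^{\xi}$, $D_{\zeta} \subseteq W^{\zeta}$.
The lemma follows from the following facts: If $\eta \le \xi$, then $\pi_{\eta,\xi}^{-1}(B_{\eta}) \subseteq W^{\xi}$
and $\pi_{\eta,\alpha}^{-1}(B_{\eta}) = \pi_{\xi,\alpha}^{-1}(\pi_{\eta,\xi}^{-1}(B_{\eta}))$, so we can assume without loss of generality that
$\beta = \eta = \xi = \zeta$. By Remark 8, we have $P_i(w^{\alpha}) \subseteq \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))$. $\Box$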


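PROOF OF LEMMA 12. Let $v^{\alpha} \in E$. Since $\beta, o^{\lambda}(v_i^{\alpha}), o^{\lambda}(v_j^{\alpha}) < \lambda$, there is an
ordinal $\xi$ such that $\max\{\beta, o^{\lambda}(v_i^{\alpha}), o^{\lambda}(v_j^{\alpha})\} < \xi < \lambda$ and such that the parity of
$\xi + 1$ is different from $\lambda\text{-}\mathrm{par}(v_j^{\alpha})$. Define now $u^{\alpha} \in W^{\alpha}$ by

\[ u_0 := v_0, \]
\[ u_i^{\alpha} := v_i^{\alpha}, \]
\[ u_j^{\alpha}(\gamma) := v_j^{\alpha}(\gamma) \quad \text{for all } \gamma < \alpha \text{ with } \gamma \neq \xi, \]
\[ u_j^{\alpha}(\xi) := 1. \]
It follows that $\lambda\text{-}\mathrm{par}(u_j^{\alpha}) \neq \lambda\text{-}\mathrm{par}(v_j^{\alpha})$ and $u^{\alpha}{\upharpoonright}\beta = v^{\alpha}{\upharpoonright}\beta$, which implies $u^{\alpha} \in E$.

(a) If $v^{\alpha} \in P_i(w^{\alpha})$, then it is easy to check that $u^{\alpha} \in P_i(v^{\alpha}) = P_i(w^{\alpha})$.

(b) If $v^{\alpha} \in \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda)) \cap (W^{\alpha} \setminus P_i(w^{\alpha}))$, then it follows that
$v^{\alpha} \in \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda))$ and $v_i^{\alpha}(\lambda) = 1$. It is again easy to check that
$u^{\alpha} \in \pi_{\lambda,\alpha}^{-1}(P_i(v^{\alpha}{\upharpoonright}\lambda)) = \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda))$ and since $u_i^{\alpha}(\lambda) = 1$, we have
$u^{\alpha} \in (W^{\alpha} \setminus P_i(w^{\alpha}))$.
(c) If $v^{\alpha} \notin \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda))$, then there are four cases:

1. $v_i^{\alpha}{\upharpoonright}\lambda \neq w_i^{\alpha}{\upharpoonright}\lambda$. From $u_i^{\alpha}{\upharpoonright}\lambda = v_i^{\alpha}{\upharpoonright}\lambda$ it follows that $u^{\alpha} \notin \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda))$.
2. There is a $\gamma < \lambda$ such that

\[ (v_i^{\alpha}{\upharpoonright}\lambda)(\gamma + 1) = (w_i^{\alpha}{\upharpoonright}\lambda)(\gamma + 1) = 1 \quad\text{and}\quad (v_j^{\alpha}{\upharpoonright}\lambda)(\gamma) \neq (w_j^{\alpha}{\upharpoonright}\lambda)(\gamma). \]

Since $\gamma + 1 < \lambda$ and $\max\{\beta, o^{\lambda}(v_i^{\alpha}), o^{\lambda}(v_j^{\alpha})\} > \gamma + 1$, it follows that
$u^{\alpha}{\upharpoonright}\gamma + 2 = v^{\alpha}{\upharpoonright}\gamma + 2$ and therefore $u^{\alpha} \notin \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda))$.
3. There is a limit ordinal $\tilde{\lambda} < \lambda$ such that

\[ (v_i^{\alpha}{\upharpoonright}\lambda)(\tilde{\lambda}) = (w_i^{\alpha}{\upharpoonright}\lambda)(\tilde{\lambda}) = 1 \quad\text{and}\quad \tilde{\lambda}\text{-}\mathrm{par}(v_j^{\alpha}) \neq \tilde{\lambda}\text{-}\mathrm{par}(w_j^{\alpha}). \]
We have $\xi > \tilde{\lambda}$, and therefore
\[ (u_i^{\alpha}{\upharpoonright}\lambda)(\tilde{\lambda}) = (v_i^{\alpha}{\upharpoonright}\lambda)(\tilde{\lambda}) \quad\text{and}\quad \tilde{\lambda}\text{-}\mathrm{par}(u_j^{\alpha}) = \tilde{\lambda}\text{-}\mathrm{par}(v_j^{\alpha}). \]
It follows that $u^{\alpha} \notin \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda))$.
4. $(v_i^{\alpha}{\upharpoonright}\lambda)(0) = (w_i^{\alpha}{\upharpoonright}\lambda)(0) = 1$ and $v_0 \neq w_0$. We have
\[ (u_i^{\alpha}{\upharpoonright}\lambda)(0) = (v_i^{\alpha}{\upharpoonright}\lambda)(0) = 1 \quad\text{and}\quad u_0 = v_0. \]
It follows that $u^{\alpha} \notin \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda))$. $\Box$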


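PROOF OF LEMMA 13. Let $v^{\alpha} \in E$. Define $u^{\alpha} \in W^{\alpha}$ by
\[ u_0 := v_0, \]
\[ u_i^{\alpha} := v_i^{\alpha}, \]
\[ u_j^{\alpha}(\gamma) := v_j^{\alpha}(\gamma) \quad \text{for all } \gamma < \alpha \text{ with } \gamma \neq \beta, \]
\[ u_j^{\alpha}(\beta) := 1 - v_j^{\alpha}(\beta). \]
It follows that $u_j^{\alpha}(\beta) \neq v_j^{\alpha}(\beta)$ and $u^{\alpha}{\upharpoonright}\beta = v^{\alpha}{\upharpoonright}\beta$, which implies $u^{\alpha} \in E$.

(a) If $v^{\alpha} \in P_i(w^{\alpha})$, then it is easy to check that $u^{\alpha} \in P_i(v^{\alpha}) = P_i(w^{\alpha})$.

(b) If $v^{\alpha} \in \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1)) \cap (W^{\alpha} \setminus P_i(w^{\alpha}))$, then it follows that
$v^{\alpha} \in \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))$ and $v_i^{\alpha}(\beta + 1) = 1$. It is again easy to check that
$u^{\alpha} \in \pi_{\beta+1,\alpha}^{-1}(P_i(v^{\alpha}{\upharpoonright}\beta + 1)) = \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))$ and since $u_i^{\alpha}(\beta + 1) = 1$, we have
$u^{\alpha} \in (W^{\alpha} \setminus P_i(w^{\alpha}))$.
(c) If $v^{\alpha} \notin \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))$, then there are four cases:

1. $v_i^{\alpha}{\upharpoonright}\beta + 1 \neq w_i^{\alpha}{\upharpoonright}\beta + 1$. From $u_i^{\alpha}{\upharpoonright}\beta + 1 = v_i^{\alpha}{\upharpoonright}\beta + 1$, it follows that
$u^{\alpha} \notin \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))$.
2. There is a $\gamma < \beta$ such that
\[ (v_i^{\alpha}{\upharpoonright}\beta + 1)(\gamma + 1) = (w_i^{\alpha}{\upharpoonright}\beta + 1)(\gamma + 1) = 1 \]
and
\[ (w_j^{\alpha}{\upharpoonright}\beta + 1)(\gamma) \neq (v_j^{\alpha}{\upharpoonright}\beta + 1)(\gamma). \]
By the definition of $u^{\alpha}$, $(u_i^{\alpha}{\upharpoonright}\beta + 1)(\gamma + 1) = 1$ and, since $\gamma < \beta$,
\[ (u_j^{\alpha}{\upharpoonright}\beta + 1)(\gamma) = (v_j^{\alpha}{\upharpoonright}\beta + 1)(\gamma), \]
hence $u^{\alpha} \notin \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))$.
3. There is a limit ordinal $\lambda < \beta + 1$ such that
\[ (v_i^{\alpha}{\upharpoonright}\beta + 1)(\lambda) = (w_i^{\alpha}{\upharpoonright}\beta + 1)(\lambda) = 1 \quad\text{and}\quad \lambda\text{-}\mathrm{par}(v_j^{\alpha}) \neq \lambda\text{-}\mathrm{par}(w_j^{\alpha}). \]
We have $(u_i^{\alpha}{\upharpoonright}\beta + 1)(\lambda) = 1$ and, since $\lambda \le \beta$, $\lambda\text{-}\mathrm{par}(u_j^{\alpha}) = \lambda\text{-}\mathrm{par}(v_j^{\alpha})$.
It follows that $u^{\alpha} \notin \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))$.
4. $(v_i^{\alpha}{\upharpoonright}\beta + 1)(0) = (w_i^{\alpha}{\upharpoonright}\beta + 1)(0) = 1$ and $v_0 \neq w_0$. We have
\[ (u_i^{\alpha}{\upharpoonright}\beta + 1)(0) = (v_i^{\alpha}{\upharpoonright}\beta + 1)(0) = 1 \quad\text{and}\quad u_0 = v_0. \]
It follows that $u^{\alpha} \notin \pi_{\beta+1,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\beta + 1))$. $\Box$
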
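
The bit-flip used in the proof of Lemma 13 can be tested exhaustively at small finite levels. The sketch below covers part (a) only and rests on assumptions not in the text: $\beta$ is a natural number, so $\alpha = \beta + 2$ is finite and the parity clause of Definition 19 never applies; players are encoded as `0` and `1`; the function names are hypothetical.

```python
from itertools import product


def in_cell(i, w, v):
    """True iff v lies in P_i(w) at a finite level; i = 0 is player a, i = 1 is player b."""
    j = 1 - i
    wi, wj, vi, vj = w[1 + i], w[1 + j], v[1 + i], v[1 + j]
    return (vi == wi
            and (wi[0] == 0 or v[0] == w[0])
            and all(wi[b + 1] == 0 or vj[b] == wj[b] for b in range(len(wi) - 1)))


def flip(v, j, beta):
    """Return the state u that agrees with v except that u_j(beta) = 1 - v_j(beta)."""
    rec = list(v[1 + j])
    rec[beta] = 1 - rec[beta]
    out = list(v)
    out[1 + j] = tuple(rec)
    return tuple(out)


def lemma13a_holds(beta, i):
    """Check part (a) at level alpha = beta + 2 for every w with w_i(beta + 1) = 0."""
    n = beta + 2
    records = list(product((0, 1), repeat=n))
    states = [(w0, ra, rb) for w0 in "ht" for ra in records for rb in records]
    j = 1 - i
    for w in states:
        if w[1 + i][beta + 1] != 0:
            continue
        for v in states:
            if not in_cell(i, w, v):
                continue
            u = flip(v, j, beta)
            same_below = (u[0] == v[0]
                          and u[1][:beta] == v[1][:beta]
                          and u[2][:beta] == v[2][:beta])
            if not (in_cell(i, w, u) and same_below and u[1 + j][beta] != v[1 + j][beta]):
                return False
    return True
```

Since `u` and `v` agree below level `beta`, membership in any set of the form $\pi_{\beta,\alpha}^{-1}(E_{\beta})$ is unaffected by the flip; the check confirms that the flipped state also stays inside $P_i(w)$.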
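PROOF OF LEMMA 14. The first point of the lemma follows from Lemmas
10 and 12.
The second point follows from the first and the fact that

\[ \Bigl[\Bigl[\bigcup_{\beta<\lambda} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\lambda,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\lambda))\Bigr],\ P_i(w^{\alpha})\Bigr] \]

is a field on $W^{\alpha}$ (and therefore it is closed under complements).
The last two points of the lemma follow directly from the first two points. $\Box$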


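PROOF OF LEMMA 15. Note that if $\alpha > \beta + 1$, then
\[ \bigcup_{\xi<\beta+1} \pi_{\xi,\alpha}^{-1}(\mathrm{Pow}(W^{\xi})) = \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})). \]

The proof is now analogous to the proof of Lemma 14: just replace $\lambda$ by $\beta + 1$
and Lemma 12 by Lemma 13. $\Box$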


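PROOF OF LEMMA 16. Since $P_i(w^{\alpha}) \subseteq \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))$, it follows from the
definition of

\[ \Bigl[\bigcup_{\beta<\gamma} \pi_{\beta,\alpha}^{-1}(\mathrm{Pow}(W^{\beta})),\ \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))\Bigr] \]

that there is a $\beta < \gamma$ and an $E^{\beta} \subseteq W^{\beta}$ such that
\[ E \cap \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma)) = \pi_{\beta,\alpha}^{-1}(E^{\beta}) \cap \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma)) \supseteq P_i(w^{\alpha}). \]

Claim. $\pi_{\beta,\alpha}^{-1}(E^{\beta}) \supseteq \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma))$.

Assume to the contrary that there is a $v^{\alpha} \in \pi_{\gamma,\alpha}^{-1}(P_i(w^{\alpha}{\upharpoonright}\gamma)) \setminus \pi_{\beta,\alpha}^{-1}(E^{\beta})$. Since
$\beta + 1 \le \gamma$, we have $v^{\alpha}{\upharpoonright}\beta + 1 \in P_i(w^{\alpha}{\upharpoonright}\beta + 1)$. By Lemma 11, there is a
$u^{\alpha} \in P_i(w^{\alpha})$ such that $u^{\alpha}{\upharpoonright}\beta = v^{\alpha}{\upharpoonright}\beta$. Since $E^{\beta} \subseteq W^{\beta}$ and $v^{\alpha} \notin \pi_{\beta,\alpha}^{-1}(E^{\beta})$, it follows
that $u^{\alpha} \notin \pi_{\beta,\alpha}^{-1}(E^{\beta})$, a contradiction to $\pi_{\beta,\alpha}^{-1}(E^{\beta}) \supseteq P_i(w^{\alpha})$. $\Box$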


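PROOF OF LEMMA 17. Assume that there is a $v^{\lambda} \in \pi_{\beta+1,\lambda}^{-1}(P_i(w^{\lambda}{\upharpoonright}\beta + 1)) \setminus \pi_{\beta,\lambda}^{-1}(E^{\beta})$.
By Lemma 11, there is a $u^{\lambda} \in P_i(w^{\lambda})$ such that $u^{\lambda}{\upharpoonright}\beta = v^{\lambda}{\upharpoonright}\beta$. Therefore
$u^{\lambda} \notin \pi_{\beta,\lambda}^{-1}(E^{\beta})$, a contradiction to $\pi_{\beta,\lambda}^{-1}(E^{\beta}) \supseteq P_i(w^{\lambda})$. $\Box$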


APPENDIX 


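PROOF OF LEMMA 7. We prove the lemma by induction on the formation of
$\kappa$-expressions.

(a) Let $\varphi = E$, where $E \in \Sigma_{\{h,t\}} = \mathrm{Pow}(\{h, t\})$, and let $w^{\kappa}, u^{\kappa} \in W^{\kappa}$ such that
$w^{\kappa}{\upharpoonright}0 = u^{\kappa}{\upharpoonright}0$. By definition, $v^{\kappa} \in E^{W^{\kappa}}$ iff $\theta(v^{\kappa}) \in E$, for $v^{\kappa} \in W^{\kappa}$. But we have
$\theta(v^{\kappa}) = v_0 = v^{\kappa}{\upharpoonright}0$, for $v^{\kappa} \in W^{\kappa}$. It follows that $u^{\kappa} \in E^{W^{\kappa}}$ iff $w^{\kappa} \in E^{W^{\kappa}}$.

(b) Let $\varphi = \neg\psi$ such that $\mathrm{depth}(\varphi) \le \alpha$ and let $w^{\kappa}, u^{\kappa} \in W^{\kappa}$ such that
$w^{\kappa}{\upharpoonright}\alpha = u^{\kappa}{\upharpoonright}\alpha$. It follows that $\mathrm{depth}(\psi) \le \alpha$ and $u^{\kappa} \in \varphi^{W^{\kappa}}$ iff $u^{\kappa} \notin \psi^{W^{\kappa}}$ iff (by the
induction assumption) $w^{\kappa} \notin \psi^{W^{\kappa}}$, which is the case iff $w^{\kappa} \in \varphi^{W^{\kappa}}$.

(c) Let $p \in [0, 1]$, $i \in \{a, b\}$, $\varphi = B_i^{p}(\psi)$, $\mathrm{dp}(\psi) + 1 = \beta + 1 \le \alpha$ and
$w^{\kappa}, u^{\kappa} \in W^{\kappa}$ such that $w^{\kappa}{\upharpoonright}\alpha = u^{\kappa}{\upharpoonright}\alpha$. By the induction assumption, there is an $E^{\beta} \subseteq W^{\beta}$
such that $\psi^{W^{\kappa}} = \pi_{\beta,\kappa}^{-1}(E^{\beta})$. By Theorem 2 and Remark 11,

\[ T_i(u^{\kappa})(\psi^{W^{\kappa}}) = T_i(w^{\kappa})(\psi^{W^{\kappa}}). \]

It follows that $u^{\kappa} \in (B_i^{p}(\psi))^{W^{\kappa}}$ iff $w^{\kappa} \in (B_i^{p}(\psi))^{W^{\kappa}}$.

(d) Let $|\Psi| < \kappa$, $\varphi = \bigwedge_{\psi\in\Psi}\psi$ such that $\mathrm{depth}(\varphi) \le \alpha$, and let $w^{\kappa}, u^{\kappa} \in W^{\kappa}$
such that $w^{\kappa}{\upharpoonright}\alpha = u^{\kappa}{\upharpoonright}\alpha$. Then $\mathrm{depth}(\psi) \le \alpha$, for $\psi \in \Psi$. By the induction assumption,
$u^{\kappa} \in \psi^{W^{\kappa}}$ iff $w^{\kappa} \in \psi^{W^{\kappa}}$, for $\psi \in \Psi$. It follows that $u^{\kappa} \in \varphi^{W^{\kappa}}$ iff $w^{\kappa} \in \varphi^{W^{\kappa}}$. $\Box$

Lemma 8 follows directly from Theorem 2 and Remark 11.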


PROOF OF LEMMA 9. The first point is clear. We show the second and the 
third points by a transfinite induction on 0 < f <x. 




(a) B =0. According to Lemma 8 and the first point of this lemma, we have 
(y} 0)" = (Blah v BH)" = (XP =H 
and 
(9$ (0) = (~B (A V Bi 0)" - (Xf @ 01. 
And obviously, dp(o? (0)) — dp(} (0)) = 1. 
I p=y + 1. According to the induction assumption, there are x-expressions 
9; QU (y) and g! j 1 (y) such that 
(p(y) = [X¥(y) =O], 
(oly) =[X*(y) — 11, 
dpl? (y)) = dp(gj Q^) 
= p. 
Define 
pi (B) :— B] (9 (y)) v Bj (91 y) 
and 
9$ (B) := 797 (B). 
pł (B) and qp (B) are «-expressions of depth 6 + 1. We have 
[X7 (8) = 0] = W* \ [X (8) = 1]. 
By Lemma 8 and the induction assumption, it follows that 
[X (8) = 11 = (ej (9) 
and 
[Xf (8) —0] = (p (8) ". 
(c) Let A < x be a limit ordinal. For i € (a, b) and B < A define in W“: 
D^ (8)]:- IXF() 130. A) IXf (0) =], 


Ba «A 
[Z}]:= f] [Xf (e) 2 01. 
0a «A 
According to the induction assumption for a, B < X and the fact that |À] < x, it 
follows that 


WB —eB)^ A vo) 


B<a<i 




and 
xt: A ea) 
0xa «A 
are x -expressions such that 
dp(V7 (8)) = max(8 + 1, sup(dp(gf ())]8 <a <A}} 
= sup{@ + lla <A} 
=À, 


and, similarly, dp( x?) EA. 
It follows from the induction assumption that 


[Y^ (8)] = (28) and. [ZH 5 (xj). 
Since o^ (w; ), for w € Wi, can never be a limit ordinal, we have 
[A —par(X7)=even]=[Z}]U |) [Y?(8)1 
p «A,p odd 


and 


[A—par(X?) —odd]- |} [YÀ(8)). 


P «X, even 


Again, since |À] < x, it follows from the above that 


9e; Q):x^v V WB) 


B «^, odd 


and 


990): WV wh) 


B «A, even 


are «expressions such that 


dp(y;"""(A)) = max(dp(x?), sup(dp(V? (8))18 <A, Bodd}} =A, 


and 
dpo aA) — 4. 
By the definitions and the induction assumption, we have 


(o£ ^ (ay) = [A — par(X*) = even] 


I 


and 


(9995 (4)) = [A — par(X*) = odd]. 




(d) B =A, A limit ordinal < x. By Lemma 8 and the above we have 


- (9109) = (BEF) v BEAD)” = [Xf Q) =, 


(PADY :— (BLETA) v BH o? 09)" = IXFQ) = 0] 
and 
dp(B} (o? Q)) v Bl way)” 


= dp(^(Blg? ^ Q)) v BEA)“ 
—A-4 1. E 


Acknowledgments. I would like to thank an anonymous referee for very valu- 
able suggestions that improved the readability of the paper a lot. Helpful comments 
of Aviad Heifetz, Jean-François Mertens and Dov Samet are gratefully acknowl- 
edged. 


REFERENCES 


[1] AUMANN, R. J. and BRANDENBURGER, A. (1995). Epistemic conditions for Nash equilibrium.
Econometrica 63 1161-1180. MR1348517
[2] BATTIGALLI, P. and SINISCALCHI, M. (1999). Interactive beliefs and forward induction.
Working Paper ECO 99/15, European Univ. Institute.
[3] BRANDENBURGER, A. and DEKEL, E. (1993). Hierarchies of beliefs and common knowledge.
J. Econom. Theory 59 189-198. MR1211557
[4] DEVLIN, K. (1993). The Joy of Sets, 2nd ed. Springer, New York. MR1237397
[5] HARSANYI, J. C. (1967/68). Games with incomplete information played by Bayesian players,
I-III. Management Sci. 14 159-182, 320-334, 486-502. MR246649
[6] HEIFETZ, A. (1993). The Bayesian formulation of incomplete information: the noncompact
case. Internat. J. Game Theory 21 329-338. MR1222760
[7] HEIFETZ, A. (2002). Limitations of the syntactic approach. In Handbook of Game Theory 3
(R. J. Aumann and S. Hart, eds.) 1682-1684. North-Holland, Amsterdam.
[8] HEIFETZ, A. and MONGIN, P. (2001). Probability logic for type spaces. Games Econom.
Behav. 35 31-53. MR1822464
[9] HEIFETZ, A. and SAMET, D. (1998). Knowledge spaces with arbitrarily high rank. Games
Econom. Behav. 22 260-273. MR1610077
[10] HEIFETZ, A. and SAMET, D. (1998). Topology-free typology of beliefs. J. Econom. Theory 82
324-341. MR1662246
[11] HORN, A. and TARSKI, A. (1948). Measures in Boolean algebras. Trans. Amer. Math. Soc. 64
467-497. MR28922
[12] ŁOŚ, J. and MARCZEWSKI, E. (1949). Extensions of measure. Fund. Math. 36 267-276.
MR35327
[13] MERTENS, J. F., SORIN, S. and ZAMIR, S. (1994). Repeated games. Part A. Background
material. CORE Discussion Paper 9420, Univ. Catholique de Louvain.
[14] MERTENS, J. F. and ZAMIR, S. (1985). Formulation of Bayesian analysis for games with
incomplete information. Internat. J. Game Theory 14 1-29. MR784702
[15] SAVAGE, L. J. (1954, 1972). The Foundations of Statistics. Wiley, New York. [Second edition
(1972) Dover, New York.] MR348870
[16] STALNAKER, R. (1998). Belief revision in games: Forward and backward induction. Math.
Social Sci. 36 31-56. MR1636254


INSTITUTO DE ANÁLISIS ECONÓMICO, CSIC 
CAMPUS UAB 

08193 BELLATERRA, BARCELONA 

SPAIN 

E-MAIL: Martin.Meier@uab.es 


The Annals of Probability 

2006, Vol. 34, No. 1, 423-426 

DOI: 10.1214/009117905000000503 

© Institute of Mathematical Statistics, 2006 


CORRECTION 


IMPROPER REGULAR CONDITIONAL 
DISTRIBUTIONS 


BY TEDDY SEIDENFELD, MARK J. SCHERVISH AND JOSEPH B. KADANE 


Carnegie Mellon University 


A strict inequality appears in Definition 6 where a weak inequality is needed. 
We reproduce Definition 6 here. 


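DEFINITION 6. Fix $\omega$ and consider those $A$ such that $\omega \in A \in \mathcal{A}$. If for some
$\omega \in A \in \mathcal{A}$, $P(A|\mathcal{A})(\omega) = 0$, say that $P(\cdot|\mathcal{A})$ is maximally improper at $\omega$.
Otherwise, if for each $\omega \in A \in \mathcal{A}$, $1 \ge P(A|\mathcal{A})(\omega) > 0$, say that the rcd is modestly
proper at $\omega$.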


At the bottom of page 1614, we are not precise in the definition of a Borel space.
The condition should have read that there is a one-to-one measurable function with
measurable inverse between $(\Omega, \mathcal{B})$ and $(E, \mathcal{E})$, where $E$ is a Borel subset of the
reals and $\mathcal{E}$ is the Borel $\sigma$-field of subsets of $E$. After the remaining corrections
below, our use of the term "Borel space" conforms with this definition.

Some conditions were left out of Theorem 4 and Lemma 3. The proof of
Lemma 3 also had some errors that made it almost impossible to follow. Finally,
the proof of Theorem 4 was said to be straightforward from Theorem 3. We include
here the restatements of both results with the missing conditions, the revised
proof of Lemma 3, and a proof of Lemma 4. The only application of Lemma 4
given in the original paper is to the proof of Corollary 2. The additional conditions
given here are satisfied in that case.


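THEOREM 4. Assume that $\mathcal{A}$ is an atomic sub-$\sigma$-field of $\mathcal{B}$. Let $(\Theta, \mathcal{D})$ be a
Borel space, with a probability measure $\mu$. For each $\theta \in \Theta$, let $P_\theta$ be a probability
on $\mathcal{B}$ such that for every $B \in \mathcal{B}$, $P_\theta(B)$ is a $\mathcal{D}$-measurable function of $\theta$. Let
$P(\cdot)$ be defined on $\mathcal{B}$ by $P(\cdot) = \int_\Theta P_\theta(\cdot)\,d\mu(\theta)$. Assume that, for $\mu$-almost all $\theta$,
$P_\theta(\cdot|\mathcal{A})$ is a maximally improper rcd for $P_\theta$ and that it is $\mathcal{A} \otimes \mathcal{D}$-measurable as a
function of $(\omega, \theta)$. Also, assume that the set

$$B^* = \{(\omega, \theta) : P_\theta(\cdot|\mathcal{A}) \text{ is maximally improper at } \omega\}$$

is in $\mathcal{A} \otimes \mathcal{D}$. Then there is a maximally improper version of $P(\cdot|\mathcal{A})$.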


Received October 2004; revised April 2005. 


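LEMMA 3. Let $(\Theta, \mathcal{D})$ be a Borel space, with a probability measure $\mu$. For
each $\theta \in \Theta$, let $P_\theta$ be a probability on $\mathcal{B}$ such that for every $B \in \mathcal{B}$, $P_\theta(B)$
is a $\mathcal{D}$-measurable function of $\theta$. Define the probability $P$ on $\mathcal{B}$ by $P(B) =
\int_\Theta P_\theta(B)\,d\mu(\theta)$. Let $\mathcal{A}$ be a sub-$\sigma$-field of $\mathcal{B}$. Also, let $P_\theta(\cdot|\mathcal{A})$ denote an rcd
for each $P_\theta$ that is $\mathcal{A} \otimes \mathcal{D}$-measurable as a function of $(\omega, \theta)$. Then, for each $\omega$
there exists a probability $\nu_\omega$ on $\mathcal{D}$ such that for all $B \in \mathcal{B}$

(1) $\int_\Theta P_\theta(B|\mathcal{A})(\omega)\,d\nu_\omega(\theta)$

is a version of $P(B|\mathcal{A})$. Also, these versions form an rcd.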


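PROOF. Let $\mathcal{E}$ be the product $\sigma$-field $\mathcal{B} \otimes \mathcal{D}$. For each $E \in \mathcal{E}$, define

$$E_\theta = \{\omega : (\omega, \theta) \in E\},$$
$$E^\omega = \{\theta : (\omega, \theta) \in E\},$$

the $\theta$- and $\omega$-sections of $E$. Standard arguments like those of Billingsley ([1], Section 18)
allow us to conclude that $E_\theta \in \mathcal{B}$ for all $\theta$, and $P_\theta(E_\theta)$ is a $\mathcal{D}$-measurable
function of $\theta$. Define

$$Q(E) = \int_\Theta P_\theta(E_\theta)\,d\mu(\theta),$$

which is easily seen to be a probability on $\mathcal{E}$. Let $\pi_1(\omega, \theta) = \omega$ and $\pi_2(\omega, \theta) = \theta$ be
the coordinate projections, which are $\mathcal{E}$-measurable. Let $\mathcal{A}' = \pi_1^{-1}(\mathcal{A})$ and $\mathcal{D}' =
\pi_2^{-1}(\mathcal{D})$, which are sub-$\sigma$-fields of $\mathcal{E}$. Every $\mathcal{A}'$-measurable function must be an
$\mathcal{A}$-measurable function of $\pi_1$. Because $(\Theta, \mathcal{D})$ is a Borel space, there exists an
rcd for $\pi_2$ given $\mathcal{A}'$ relative to $Q$, $Q(\cdot|\mathcal{A}')$. We will denote $Q(\pi_2^{-1}(D)|\mathcal{A}')(\omega, \theta)$
by $\nu_\omega(D)$. In similar fashion to the arguments earlier in the proof, $\nu_\omega(E^\omega)$ is
$\mathcal{A}$-measurable as a function of $\omega$ for all $E \in \mathcal{E}$. Define

$$Q_0(E) = \int \nu_\omega(E^\omega)\,dP(\omega).$$

For each $A \in \mathcal{A}$ and $D \in \mathcal{D}$, we have

$$Q_0(A \times D) = \int I_A(\omega)\,\nu_\omega(D)\,dP(\omega) = Q(A \times D).$$

It follows that $Q_0 = Q$ on all of $\mathcal{A} \otimes \mathcal{D}$.

For each $\omega$, (1) is a probability. We need to show that it is $\mathcal{A}$-measurable as
a function of $\omega$. We have assumed that $P_\theta(\cdot|\mathcal{A})(\omega)$ is $\mathcal{A} \otimes \mathcal{D}$-measurable, so
we can approximate it from below by a sequence $(\phi_n)_{n=1}^\infty$ of nonnegative simple
functions. In similar fashion to the argument at the beginning of this proof,
$\nu_\omega(E^\omega)$ is $\mathcal{A}$-measurable for all $E \in \mathcal{A} \otimes \mathcal{D}$. It follows that $\int \phi_n(\omega, \theta)\,d\nu_\omega(\theta)$ is
$\mathcal{A}$-measurable for each $n$, and (1) is a limit of $\mathcal{A}$-measurable functions.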




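To complete the proof, we show that, for each $A \in \mathcal{A}$ and $B \in \mathcal{B}$, the integral
of (1) over $A$ equals $P(A \cap B)$:

$$\begin{aligned}
\int_A \int_\Theta P_\theta(B|\mathcal{A})(\omega)\,d\nu_\omega(\theta)\,dP(\omega)
&= \int I_A(\omega)\,P_\theta(B|\mathcal{A})(\omega)\,dQ_0(\omega, \theta) \\
&= \int I_A(\omega)\,P_\theta(B|\mathcal{A})(\omega)\,dQ(\omega, \theta) \\
&= \int_\Theta \int_\Omega I_A(\omega)\,P_\theta(B|\mathcal{A})(\omega)\,dP_\theta(\omega)\,d\mu(\theta) \\
&= \int_\Theta P_\theta(A \cap B)\,d\mu(\theta) = P(A \cap B),
\end{aligned}$$

where the first equality is from the definition of $Q_0$, the second follows from the
fact that $Q_0 = Q$ on $\mathcal{A} \otimes \mathcal{D}$, the third is from the definition of $Q$, the fourth is
from the definition of $P_\theta(\cdot|\mathcal{A})$ and the last is the meaning of $P_\theta$. □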
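As a sanity check on the statement of Lemma 3, the following sketch works through a finite toy case in which every quantity can be computed exactly. Everything in it is an illustrative assumption rather than part of the correction: the four-point sample space, the two-atom partition generating the sub-$\sigma$-field, the two-point parameter set, the numerical probabilities, and all function names are hypothetical. In this finite setting the measure $\nu_\omega$ reduces to the posterior distribution of $\theta$ given the atom containing $\omega$, and the mixture of the conditional probabilities against $\nu_\omega$ should coincide with the conditional probability under the mixture $P$.

```python
from fractions import Fraction as F
from itertools import combinations

# Hypothetical finite setup (not from the paper): Omega has four points,
# the sub-sigma-field is generated by the partition {0,1} | {2,3},
# and Theta = {"x", "y"} carries the mixing measure MU.
OMEGA = [0, 1, 2, 3]
ATOMS = [frozenset({0, 1}), frozenset({2, 3})]
MU = {"x": F(1, 3), "y": F(2, 3)}
P_THETA = {
    "x": {0: F(1, 2), 1: F(1, 4), 2: F(1, 8), 3: F(1, 8)},
    "y": {0: F(1, 10), 1: F(2, 10), 2: F(3, 10), 3: F(4, 10)},
}


def prob(p, event):
    """Probability of a set of points under the point-mass table p."""
    return sum(p[w] for w in event)


def atom_of(w):
    """The atom of the partition that contains the point w."""
    return next(a for a in ATOMS if w in a)


def mixture():
    """The mixture P = sum over theta of MU(theta) * P_theta."""
    return {w: sum(MU[t] * P_THETA[t][w] for t in MU) for w in OMEGA}


def cond(p, event, w):
    """Conditional probability of event given the atom containing w."""
    a = atom_of(w)
    return prob(p, event & a) / prob(p, a)


def nu(w):
    """Finite analogue of nu_omega: posterior on Theta given the atom of w."""
    a = atom_of(w)
    total = prob(mixture(), a)
    return {t: MU[t] * prob(P_THETA[t], a) / total for t in MU}


def mixed_cond(event, w):
    """Finite analogue of expression (1): mix P_theta(B|A)(w) against nu_w."""
    weights = nu(w)
    return sum(weights[t] * cond(P_THETA[t], event, w) for t in MU)


def all_events():
    """Every subset of OMEGA, as a frozenset."""
    for r in range(len(OMEGA) + 1):
        for combo in combinations(OMEGA, r):
            yield frozenset(combo)
```

Calling `mixed_cond(B, w)` and `cond(mixture(), B, w)` for each `B` in `all_events()` and each `w` in `OMEGA` compares the two sides with exact rational arithmetic, so no floating-point tolerance is needed. The toy case says nothing about the measurability issues that the actual proof has to handle.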


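PROOF OF THEOREM 4. Because $\mathcal{A}$ is atomic, $P_\theta(\cdot|\mathcal{A})$ is maximally improper
at $\omega$ if and only if $P_\theta(a(\omega)|\mathcal{A})(\omega) = 0$, where $a(\omega)$ is the $\mathcal{A}$-atom
containing $\omega$. Hence, we can rewrite the set $B^*$ as

$$B^* = \{(\omega, \theta) : P_\theta(a(\omega)|\mathcal{A})(\omega) = 0\},$$

whose $\theta$-sections satisfy

$$B^*_\theta = \{\omega : P_\theta(a(\omega)|\mathcal{A})(\omega) = 0\} \in \mathcal{B}.$$

For each $\theta$ such that $P_\theta(\cdot|\mathcal{A})$ is maximally improper, $B^*_\theta$ has inner $P_\theta$ measure 1.
Hence $P_\theta(B^*_\theta) = 1$, a.e. $[\mu]$. By standard arguments, $P_\theta(B^*_\theta)$ is $\mathcal{D}$-measurable,
and it follows that

$$Q(B^*) = \int_\Theta P_\theta(B^*_\theta)\,d\mu(\theta) = 1,$$

where $Q$ was constructed in the proof of Lemma 3.
Similarly, the $\omega$-sections of $B^*$ satisfy

$$B^{*\omega} = \{\theta : P_\theta(a(\omega)|\mathcal{A})(\omega) = 0\} \in \mathcal{D}.$$

For each $\omega$, let $\nu_\omega$ be the measure from Lemma 3. Then $\nu_\omega(B^{*\omega})$ is $\mathcal{A}$-measurable.
Since $B^* \in \mathcal{A} \otimes \mathcal{D}$, we have

$$1 = Q(B^*) = Q_0(B^*) = \int_\Omega \nu_\omega(B^{*\omega})\,dP(\omega),$$

where $Q_0$ was constructed in the proof of Lemma 3. So, there is a set $C \in \mathcal{B}$ with
$P(C) = 1$ and for all $\omega \in C$, $\nu_\omega(B^{*\omega}) = 1$. It follows that, for each $\omega \in C$, there is a
set $E(\omega) \in \mathcal{D}$ with $\nu_\omega(E(\omega)) = 1$ such that $P_\theta(a(\omega)|\mathcal{A})(\omega) = 0$ for all $\theta \in E(\omega)$.
Let $P(\cdot|\mathcal{A})$ be the version guaranteed by Lemma 3. Then, for each $\omega \in C$,

$$P(a(\omega)|\mathcal{A})(\omega) = \int_\Theta P_\theta(a(\omega)|\mathcal{A})(\omega)\,d\nu_\omega(\theta) = 0.$$

This means that $P(\cdot|\mathcal{A})$ is maximally improper. □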




Acknowledgments. The authors would like to thank P. Berti and P. Rigo both 
for bringing these points to their attention and for helping to patch the proofs. They 
would also like to thank a referee of the correction for an extremely careful reading 
that further helped correct and simplify the proofs. 


REFERENCE

[1] BILLINGSLEY, P. (1995). Probability and Measure, 3rd ed. Wiley, New York. MR1324786

T. SEIDENFELD
DEPARTMENT OF PHILOSOPHY
CARNEGIE MELLON UNIVERSITY
PITTSBURGH, PENNSYLVANIA 15213
USA
E-MAIL: teddy@stat.cmu.edu

M. J. SCHERVISH
J. B. KADANE
DEPARTMENT OF STATISTICS
CARNEGIE MELLON UNIVERSITY
PITTSBURGH, PENNSYLVANIA 15213
USA
E-MAIL: mark@stat.cmu.edu
kadane@stat.cmu.edu


The Annals of Probability 

2006, Vol. 34, No. 1, 427-428 

DOI: 10.1214/009117905000000521

© Institute of Mathematical Statistics, 2006 


CORRECTION 


CENTRAL LIMIT THEOREMS FOR ADDITIVE FUNCTIONALS OF 
THE SIMPLE EXCLUSION PROCESS 


BY S. SETHURAMAN 


Iowa State University 


Definition 2.1 in the above paper is incorrectly stated. In the proof of Theo- 
rem 2.1, which gives an invariance principle for certain processes satisfying Defi- 
nition 2.1, conditions in Definition 2.1 are sufficient to deduce finite-dimensional 
convergence, but not enough to apply a maximal inequality for “demimartingales” 
to obtain tightness. The problem is Definition 2.1, as stated, only considers “pair 
increment associations" and not more general associations needed for the demi- 
martingale property. We slightly strengthen the definition here in this correction 
so that the proof of tightness in Theorem 2.1 holds. Details of how this is accom- 
plished are given below. 

By substituting the corrected Definition 2.1 for the previous one, all results in
the article hold as written. In particular, Proposition 2.1, which is the link between 
Theorem 2.1 and the main results, and which states certain additive processes sat- 
isfy Definition 2.1, holds with the same argument. 


CORRECTED DEFINITION 2.1. Let $\{v(t) = (v_1(t), \ldots, v_m(t)) : t \ge 0\}$ be an
$m$-dimensional $L^2$ process with stationary increments. We say $v$ has weakly positive
associated increments if

$$E[\phi(v(t+s) - v(s))\,\psi(v(s_1), \ldots, v(s_n))] \ge E[\phi(v(t))]\,E[\psi(v(s_1), \ldots, v(s_n))]$$

for all coordinatewise increasing functions $\phi : \mathbb{R}^m \to \mathbb{R}$ and $\psi : (\mathbb{R}^m)^n \to \mathbb{R}$, and
all $s, t \ge 0$, $0 \le s_1 < \cdots < s_n = s$ and $n \ge 1$.


We remark the earlier Definition 2.1 only stipulated the pair condition 


$$E[\phi(v(t+s) - v(s))\,\psi(v(s))] \ge E[\phi(v(t))]\,E[\psi(v(s))].$$


We now indicate how the modified definition applies in the proof of tight- 
ness in Theorem 2.1. Following standard tightness arguments, one needs to prove 


Received May 2005; revised June 2005. 


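for a continuous mean-zero scalar process $v(t)$ with stationary increments satisfying
corrected Definition 2.1, with $v(0) = 0$, $\lim_{t\to\infty} t^{-1} E[v(t)^2] = \sigma^2$ and
$t^{-1/2} v(t) \Rightarrow N(0, \sigma^2)$ that, for all $\varepsilon > 0$,

(1) $\displaystyle \lim_{\delta \downarrow 0} \limsup_{\alpha \to \infty} \frac{1}{\delta}\, P\Big[\sup_{t \in [0,\delta]} |v(\alpha t)| > \varepsilon\sqrt{\alpha}\Big] = 0.$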


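For $\delta > 0$, let $\Lambda$ be a countable dense set of $[0, \delta]$, and for $n \ge 1$, let $\Lambda_n$ be a set
of $n$ points so that $\Lambda_n \uparrow \Lambda$. Fix also that $\delta \in \Lambda$ and $\delta \in \Lambda_n$. Then, for $\alpha \ge 1$, by
continuity $\sup_{t \in [0,\delta]} |v(\alpha t)| = \sup_{t \in \Lambda} |v(\alpha t)|$, and for $n$ large enough

$$P\Big[\sup_{t \in \Lambda} |v(\alpha t)| > \varepsilon\sqrt{\alpha}\Big] \le 2\,P\Big[\sup_{t \in \Lambda_n} |v(\alpha t)| > \varepsilon\sqrt{\alpha}\Big].$$

Let now $0 \le t_1 < \cdots < t_{n-1} < t_n = \delta$ be a labeling of $\Lambda_n$. From the corrected
definition and mean-zero property $E[v(t)] = 0$ we have
$E[(v(\alpha t_{j+1}) - v(\alpha t_j))\,\psi(v(\alpha t_j), \ldots, v(\alpha t_1))] \ge 0$ for all $1 \le j \le n - 1$ and increasing $\psi$,
and so $\{v(\alpha t) : t \in \Lambda_n\}$ is a demimartingale (cf. [1], page 362). Hence, we
can apply the maximal inequality ([2], Corollary 6) and variance convergence
$\lim_{\alpha\to\infty} (\alpha\delta)^{-1} E[v(\alpha\delta)^2] = \sigma^2$ to get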


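$$\limsup_{\alpha \to \infty} P\Big[\sup_{t \in \Lambda_n} |v(\alpha t)| > \varepsilon\sqrt{\alpha}\Big] \le \frac{C \sigma \sqrt{\delta}}{\varepsilon} \lim_{\alpha \to \infty} \big(P[|v(\alpha\delta)| > (\varepsilon/2)\sqrt{\alpha}]\big)^{1/2}$$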


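for a universal constant $C$. From marginal convergence, $\lim_{\alpha\to\infty} P[|v(\alpha\delta)| >
(\varepsilon/2)\sqrt{\alpha}] = (2\pi\sigma^2)^{-1/2} \int_{|x| > (\varepsilon/2)\delta^{-1/2}} \exp(-x^2/(2\sigma^2))\,dx$, and so (1) holds.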
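The last step relies on how fast the Gaussian tail $P[|N(0,\sigma^2)| > \varepsilon/(2\sqrt{\delta})]$ vanishes as $\delta \downarrow 0$. The sketch below evaluates that tail with the standard identity $P[|N(0,\sigma^2)| > x] = \operatorname{erfc}(x/(\sigma\sqrt{2}))$ and then divides its square root by $\delta$. The square root and the division by $\delta$ are illustrative assumptions about the shape of the bound, chosen to be at least as demanding as what (1) needs; the function names and the default $\sigma = 1$ are hypothetical.

```python
import math


def gaussian_tail(eps, delta, sigma=1.0):
    """P[|N(0, sigma^2)| > eps / (2 * sqrt(delta))], via the erfc identity."""
    threshold = eps / (2.0 * math.sqrt(delta))
    return math.erfc(threshold / (sigma * math.sqrt(2.0)))


def scaled_root_tail(eps, delta, sigma=1.0):
    """Square root of the tail divided by delta (an assumed, conservative scaling)."""
    return math.sqrt(gaussian_tail(eps, delta, sigma)) / delta


if __name__ == "__main__":
    for delta in (0.1, 0.01, 0.001):
        print(delta, gaussian_tail(1.0, delta), scaled_root_tail(1.0, delta))
```

Because the tail decays like $\exp(-\varepsilon^2/(8\sigma^2\delta))$ up to polynomial factors, any fixed power of it beats any polynomial factor in $1/\delta$, which is the reason the limit in (1) is zero.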

Also, we note typos: in line 8, page 281, change 2/(z det(o?)) to 1/ (rr x 
 (det(o7)) ^); in lines 9-10, page 286, ds to dr; in line 10, page 293, = to >; in 


line 29, page 294, change > 0 to < oo; in line 27, page 297, exp(X25 —h—1)(—is) 
should be (47s — 2A) exp(—As). | 


REFERENCES 


[1] NEWMAN, C. M. and WRIGHT, A. L. (1982). Associated random variables and martingale 
inequalities. Z. Wahrsch. Verw. Gebiete 59 361—371. MR0721632 

[2] SETHURAMAN, S. (2000). Central limit theorems for additive functionals of the simple exclu- 
sion process. Ann. Probab. 28 277-302. MR1756006 


DEPARTMENT OF MATHEMATICS
IOWA STATE UNIVERSITY 

396 CARVER HALL 

AMES, IOWA 50011 

USA 

E-MAIL: sethuram@iastate.edu



The Annals of Probability
Vol. 34, No. 2, March 2006




EDITORIAL STAFF 


EDITOR 
GREGORY F. LAWLER 


ASSOCIATE EDITORS 
ANDREW BARBOUR OLLE HAGGSTROM DANIEL L. OCONE 
ITAI BENJAMINI FRANK DEN HOLLANDER KAVITA RAMANAN 
KRZYSZTOF BURDZY TAKASHI KUMAGAI GENNADY SAMORODNITSKY 
MICHAEL CRANSTON STANISLAW KWAPIEN TIMO SEPPALAINEN 
PERSI DIACONIS WENBO LI ODED SCHRAMM 
BRUCE DRIVER RUSSELL LYONS CRAIG TRACY 
ALISON ETHERIDGE JONATHAN MATTINGLY OFER ZEITOUNI 


CARL MUELLER 


PUBLISHING SERVICES MANAGER 
GERI MATTSON 





MANAGING EDITOR 
MICHAEL PHELAN 


PRODUCTION EDITOR 
PATRICK KELLY 


PAST EDITORS 
THE ANNALS OF MATHEMATICAL STATISTICS 


H. C. CARVER, 1930-1938 WILLIAM KRUSKAL, 1958-1961 

S. S. WILKS, 1938-1949 J. L. HODGES, JR., 1961-1964 

T. W. ANDERSON, 1950-1952 D. L. BURKHOLDER, 1964-1967 

E. L. LEHMANN, 1953-1955 Z. W. BIRNBAUM, 1967-1970
T. E. HARRIS, 1955-1958 INGRAM OLKIN, 1970-1972 


THE ANNALS OF PROBABILITY 


RONALD PYKE, 1972-1975 BURGESS DAVIS, 1991-1993
PATRICK BILLINGSLEY, 1976-1978 JIM PITMAN, 1994-1996

R. M. DUDLEY, 1979-1981 S. R. S. VARADHAN, 1996-1999
HARRY KESTEN, 1982-1984 THOMAS G. KURTZ, 2000-2002
THOMAS M. LIGGETT, 1985-1987 STEVEN LALLEY, 2003-2005


PETER NEY, 1988-1990 


EDITORIAL POLICY 


The main purpose of The Annals of Probability, The Annals of Applied Probability and 
The Annals of Statistics is to publish contributions to the theory of probability and statistics 
and to their applications. The emphasis is on importance and interest; formal novelty and 
mathematical correctness alone are not sufficient. Also appropriate are authoritative expository 
papers and surveys of areas in vigorous development. All papers are refereed. 


NOTICE 


Manuscripts submitted for publication in The Annals of Probability should be submitted electroni- 
cally. Authors should access the Electronic Journal Management System (EJMS) at http://www.e- 
publications.org/ims/submission/. 


IMS ORGANIZATIONAL MEMBERS 


ACADEMIA SINICA 

ARIZONA STATE UNIVERSITY 
AUSTRALIAN NATIONAL UNIVERSITY 
BATH UNIVERSITY 


BATTELLE PACIFIC 
NORTHWEST NATIONAL LABORATORY 


BOWLING GREEN UNIVERSITY 

CARLETON UNIVERSITY 

CENTRUM VOOR WISKUNDE EN INFORMATICA 
CALIFORNIA STATE UNIVERSITY, EAST BAY 


CHALMERS UNIVERSITY OF TECHNOLOGY 
& GOTEBORG UNIVERSITY 


CORNELL UNIVERSITY 

DUKE UNIVERSITY 

EINDHOVEN UNIVERSITY OF TECHNOLOGY 
FIOCRUZ—FUNDACAO OSWALDO CRUZ 
FLORIDA STATE UNIVERSITY 

FU JEN CATHOLIC UNIVERSITY 

HARVARD UNIVERSITY 

HIROSHIMA UNIVERSITY

INDIAN INSTITUTE OF TECHNOLOGY 
INDIAN STATISTICAL INSTITUTE 

INDIANA UNIVERSITY 

INSTITUTE FOR DEFENSE ANALYSIS 

IOWA STATE UNIVERSITY 

ISTITUTO PER LE APPLICAZIONI DEL CALCOLO 
JOHNS HOPKINS UNIVERSITY 

KANSAS STATE UNIVERSITY 


LONDON SCHOOL OF ECONOMICS 
& POLITICAL SCIENCE 


LUND UNIVERSITY 


MASSACHUSETTS INSTITUTE 
OF TECHNOLOGY 


MATHEMATICAL SCIENCES 
RESEARCH INSTITUTE 


MCGILL UNIVERSITY 
MEDICAL COLLEGE OF WISCONSIN 


MEMORIAL SLOAN KETTERING CANCER CENTER 


MICHIGAN STATE UNIVERSITY 
MINNESOTA STATE UNIVERSITY 
NANZAN UNIVERSITY 

NATIONAL SCIENCE FOUNDATION 
NATIONAL CENTRAL UNIVERSITY 
NATIONAL CHENG KUNG UNIVERSITY 
NATIONAL CHIAO TUNG UNIVERSITY 
NATIONAL SECURITY AGENCY 

NEW MEXICO STATE UNIVERSITY 
NORTH CAROLINA STATE UNIVERSITY 
NORTH DAKOTA STATE UNIVERSITY 
NORTHERN ILLINOIS UNIVERSITY 


NOTTINGHAM TRENT UNIVERSITY 
OREGON STATE UNIVERSITY 
PENNSYLVANIA STATE UNIVERSITY 
PFIZER INC. 

PRINCETON UNIVERSITY 

PURDUE UNIVERSITY 

QUEENS UNIVERSITY 

RICE UNIVERSITY 

ROCKEFELLER UNIVERSITY 
RUTGERS UNIVERSITY 

SIEGEN UNIVERSITY 

SOUTHERN ILLINOIS UNIVERSITY 
STOCKHOLM UNIVERSITY 
TECHNISCHE UNIVERSITAT 

TEXAS A&M UNIVERSITY 

TEXAS TECH UNIVERSITY 


UNITED STATES, DEPARTMENT OF DEFENSE 


UNIVERSIDAD AUTONOMA DE MADRID 
UNIVERSIDADE DE COIMBRA 


UNIVERSITA COMMERCIALE LUIGI BOCCONI 


UNIVERSITA DEGLI STUDI DI PADOVA


UNIVERSITA DEGLI STUDI DI ROMA LA SAPIENZA


UNIVERSITAT BERN 

UNIVERSITAT KARLSRUHE 
UNIVERSITAT MUNSTER 
UNIVERSITAT ZU LUBECK 
UNIVERSITY OF ALBERTA 
UNIVERSITY OF ARIZONA 
UNIVERSITY OF BRITISH COLUMBIA 
UNIVERSITY OF CALGARY 
UNIVERSITY OF CALIFORNIA, IRVINE 


UNIVERSITY OF CALIFORNIA, LOS ANGELES 
UNIVERSITY OF CALIFORNIA, SAN DIEGO 
UNIVERSITY OF CALIFORNIA, SANTA CRUZ 


UNIVERSITY OF CONNECTICUT 
UNIVERSITY OF DENVER 
UNIVERSITY OF EDINBURGH 
UNIVERSITY OF FLORIDA 
UNIVERSITY OF GEORGIA 
UNIVERSITY OF ILLINOIS 
UNIVERSITY OF IOWA
UNIVERSITY OF MASSACHUSETTS 
UNIVERSITY OF MICHIGAN 
UNIVERSITY OF MINNESOTA 
UNIVERSITY OF MISSISSIPPI 
UNIVERSITY OF MISSOURI 
UNIVERSITY OF MONTREAL 
UNIVERSITY OF NEW BRUNSWICK 


UNIVERSITY OF NEW MEXICO 
UNIVERSITY OF NORTH CAROLINA 
UNIVERSITY OF OREGON 
UNIVERSITY OF OTTAWA 
UNIVERSITY OF OXFORD 
UNIVERSITY OF PENNSYLVANIA 
UNIVERSITY OF PITTSBURGH 
UNIVERSITY OF SOUTH CAROLINA 


UNIVERSITY OF TEXAS, DALLAS 
UNIVERSITY OF TEXAS, HOUSTON 
UNIVERSITY OF VICTORIA 

UNIVERSITY OF WASHINGTON 
UNIVERSITY OF WATERLOO 

VIRGINIA COMMONWEALTH UNIVERSITY 
WAYNE STATE UNIVERSITY 

YORK UNIVERSITY 


THE ANNALS OF PROBABILITY 


INSTRUCTIONS FOR AUTHORS 


Submission of Papers. Papers must be submitted 
electronically. Authors should access the Electronic 
Journal Management System (EJMS) at http://www.e- 
publications.org/ims/submission/. If you are a first time 
user you must complete the registration. You are only re- 
quired to register once. 

After the registration is complete you will have the 
option to submit your manuscript. After completing the 
form you will then upload your PDF file. 


Preparation of Manuscripts. Authors using LaTeX
should begin their document with

\documentclass[11pt,leqno]{article}
and produce a double-spaced manuscript with the command

\usepackage{setspace} \doublespacing.

For further information on preparing your manuscript,
please see http://www.imstat.org/aop/manprep.htm where
you will find the LaTeX support page for IMS publications
to use the IMS recommended template.


Submission of Reference Papers. Four copies of 
unpublished or not easily available papers cited in 
the manuscript should be submitted with the manuscript. 


Title. The title should be descriptive and as concise 
as is feasible, that is, it should indicate the topic of the 
paper as clearly as possible, but every word in it should 
be pertinent. 


Abbreviated Title. An abbreviated title to be used 
as a running head is also required. This should 
normally not exceed 35 characters. For example, an
article with the title “The Curvature of a Statisti- 
cal Model, with Applications to Large-Sample Likeli- 
hood Methods,” could have the running head “Curvature 
of Statistical Model" or possibly "Asymptotics of Like- 
lihood Methods" depending on the emphasis to be con- 
veyed. 


Affiliation. Indicate your present institutional affiliation 
as you would like it to appear. 


Summary. Each manuscript is required to contain a 
summary, clearly separated from the rest of the 
paper, which will be printed immediately after the 
title. Its main purpose is to inform the reader quickly 
of the nature and results of the paper; it may also be 
used as an aid in retrieving information. The length 
of a summary will clearly depend on the length and 
difficulty of the paper, but in general it should not 
exceed 150 words. Formulas should be used as spar- 
ingly as possible within the summary. The summary 
should not make reference to results or formulas in
the body of the paper—it should be self-contained. 


Footnotes. Footnotes should not be used, except as 
described under Title Page Footnotes below. Such 
information should be included within the text. 


Title Page Footnotes. Included as a footnote on page 1 
should be the headings: AMS 2000 subject classifica- 
tions. Primary-; secondary-. Key words and phrases. 

The classification numbers representing the primary 
and secondary subjects of the article may be found at 
www.ams.org/msc/. The key words and phrases should 
describe the subject matter of the article; generally they 
should be taken from the body of the paper. 

Acknowledgment of support, grants and contracts 
should also be included in this footnote. 


Figures. Figures are best prepared as separate postscript 
or encapsulated postscript files and should be included 
with the manuscript. 


References. Citations in text should be numbered 
*...using examples shown in [1]..." and the bibliogra- 
phy should be styled to appear as follows: 


[1] LAMPORT, L. (1994). LaTeX: A Document Preparation
System, 2nd ed. Addison-Wesley, Reading, MA.


[2] CHEN, X. (1999). How often does a Harris recurrent 
Markov chain recur? Ann. Probab. 27 1324-1346. 


Abbreviations for journals should be taken from
a current issue of Mathematical Reviews or from 
http://www.ams.org/msnhtml/serials.pdf. 


Copyright, Page Charges and Offprints. Page charges 
are $45 per printed page. Payment of some or all 
of the estimated page charges associated with articles is 
strongly encouraged. The editorial review of articles and
administration of page charges are completely separate 
activities. Manuscripts are reviewed and accepted prior 
to determining whether page charges will be paid. 

Every corresponding author will receive a pdf file via 
e-mail of the article. You do not need to do anything to re- 
ceive this file, it will happen automatically. Offprints may 
be purchased by using the IMS Offprint Purchase Order 
Form accompanying the galleys. 

Copyright Transfer and Page Charges forms are re- 
quired, Offprint forms are optional. We must have the re- 
quired forms before your article can be published. Note 
the address/fax number for each form on the form itself. 


Galley Proofs. Authors will receive e-mail notification 
when galleys are ready and have the option of either 
downloading a pdf version of the article or having it sent 
by regular mail. Similarly, authors may return corrections 
either by e-mail or regular mail. 


Correspondence. All correspondence with the editor 
must refer to the manuscript number of the paper. This 
number will be sent to the author acknowledging receipt 
of the article. 


The Annals of Probability 

2006, Vol. 34, No. 2, 429-467 

DOI: 10.1214/009117906000000043

© Institute of Mathematical Statistics, 2006 


THE HYPERBOLIC GEOMETRY OF RANDOM TRANSPOSITIONS¹


By NATHANAEL BERESTYCKI 
University of British Columbia and Ecole Normale Supérieure, Paris 


Turn the set of permutations of $n$ objects into a graph $G_n$ by connecting
two permutations that differ by one transposition, and let $\sigma_t$ be the simple
random walk on this graph. In a previous paper, Berestycki and Durrett [In
Discrete Random Walks (2005) 17-26] showed that the limiting behavior of
the distance from the identity at time $cn/2$ has a phase transition at $c = 1$.
Here we investigate some consequences of this result for the geometry of $G_n$.
Our first result can be interpreted as a breakdown for the Gromov hyperbolicity
of the graph as seen by the random walk, which occurs at a critical radius
equal to 1/4. Let $T$ be a triangle formed by the origin and two points sampled
independently from the hitting distribution on the sphere of radius $an$
for a constant $0 < a < 1$. Then when $a < 1/4$, if the geodesics are suitably
chosen, with high probability $T$ is $\delta$-thin for some $\delta > 0$, whereas it is always
$O(n)$-thick when $a > 1/4$. We also show that the hitting distribution of the
sphere of radius $an$ is asymptotically singular with respect to the uniform
distribution. Finally, we prove that the critical behavior of this Gromov-like
hyperbolicity constant persists if the two endpoints are sampled from the uniform
measure on the sphere of radius $an$. However, in this case, the critical
radius is $a = 1 - \log 2$.


1. Introduction. Let $\mathcal{S}_n$ be the set of permutations of $\{1, 2, \dots, n\}$, and let
$\sigma_t$ be the continuous-time random walk on $\mathcal{S}_n$ that results when randomly chosen
transpositions are performed at rate 1. Let $d(\sigma_t)$ be the distance from the identity $I$
at time $t$, that is, the minimum number of transpositions needed to return to $I$. In a
previous paper, Berestycki and Durrett [3] showed


THEOREM 0. As $n \to \infty$, $d(\sigma_{cn/2})/n \to_p u(c)$ where


(1.1)    $u(c) = 1 - \sum_{k=1}^{\infty} \frac{1}{c}\,\frac{k^{k-2}}{k!}\,(ce^{-c})^k.$


Although it is not easy to see from the formula, the function $u(c) = c/2$ for $c \le 1$
and is $< c/2$ for $c > 1$.
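The last remark can be checked numerically. The following is only a sketch: it truncates the series in (1.1) after a fixed number of terms (the cutoff `terms` is our own choice, not part of the source) and evaluates each term in log space to avoid overflow.

```python
from math import exp, lgamma, log

def u(c, terms=2000):
    """Truncated version of (1.1): 1 - sum_k (1/c) k^(k-2)/k! (c e^-c)^k."""
    total = 0.0
    for k in range(1, terms + 1):
        # log of k^(k-2)/k! * (c e^-c)^k, computed in log space
        total += exp((k - 2) * log(k) - lgamma(k + 1) + k * (log(c) - c))
    return 1.0 - total / c

subcritical = u(0.5)    # should agree with c/2 = 0.25
supercritical = u(2.0)  # should fall strictly below c/2 = 1.0
```

For $c$ close to 1 the series converges slowly, so the truncation point would need to be raised there.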


Received December 2004; revised September 2005. 
1 Supported in part by Rick Durrett's joint NSF-NIGMS Grant DMS-02-01037. 
AMS 2000 subject classifications. Primary 60G50, 60K35, 60D05; secondary 60C05.
Key words and phrases. Random walks, Gromov hyperbolic spaces, phase transition, random 
transpositions, random graphs, Cayley graphs. 



We can think of $\sigma_t$ as a random walk on the graph $G_n$, with vertices $\mathcal{S}_n$ and edges
connecting two permutations that differ by one transposition, so that $G_n$ is the
Cayley graph of $\mathcal{S}_n$ associated with the set of generators $S = \{\text{all transpositions}\}$.
Theorem 0 was proved by establishing a connection with Erdős-Rényi random
graphs. The phase transition observed for $\sigma_t$ is then related to the well-known double
jump of the size of connected components of $G(n, c/n)$ at $c = 1$. [Here and in
all that follows, $G(n, p)$ denotes the Erdős-Rényi random graph with parameters
$n$ and $p$, i.e., a random graph on $n$ vertices where each edge is present independently
of the others with probability $p$.] We refer the reader to Janson et al. [8] for
this and other facts about Erdős-Rényi random graphs.

In this paper we try to investigate some of the geometric implications of The- 
orem 0. We find a new connection between the speed of a random walk and the 
Gromov hyperbolicity of the space in which the random walk is evolving. 


Organization of the paper. In Sections 1.1, 1.2, 1.3 we present our results. The 
proofs of these results can be found successively in Sections 2-8. Each proof is 
preceded by a restatement of the corresponding theorem for convenience, and by 
an informal proof which outlines the main ideas used. 


1.1. Asymptotic hyperbolicity. The notion of hyperbolicity for a discrete structure
such as a group goes back to Gromov [7]. As there is no derivative, and thus no
curvature, available in a discrete space, the idea is to define what
hyperbolic means using only elementary properties of the space.

One way to do this is as follows. Let $(X, |\cdot|)$ be a metric space, where $|x - y|$
denotes the distance between $x$ and $y$. For points $x$, $y$ and $p$ in $X$, define the
Gromov inner product by


$2(x|y)_p = |x - p| + |y - p| - |x - y|.$


$(x|y)_p$ thus measures how well the union of the geodesic segments $[p, x] \cup [p, y]$
approximates a geodesic between $x$ and $y$. Gromov's original definition of hyperbolic
spaces is as follows. Call $X$ $\delta$-hyperbolic if


(1.2)    $(x|z)_p \ge (x|y)_p \wedge (y|z)_p - \delta$


for all $x$, $y$, $z$ and $p$. This definition is not very intuitive at first, but fortunately
there is an equivalent definition, which can be formulated using the notion of $\delta$-thin
triangle. A triangle $(x, y, z)$ with geodesic sides $s_1, s_2, s_3$ is said to be $\delta$-thin if any
side, say $s_1$, lies entirely within distance at most $\delta$ of the two remaining sides:


$s_1 \subset \{x \in X,\ d(x, s_2 \cup s_3) \le \delta\}.$


The space is called $\delta$-hyperbolic if all geodesic triangles are $\delta$-thin, and it is simply
called hyperbolic if it is $\delta$-hyperbolic for some $\delta > 0$ (when $\delta = 0$, the space
isometrically embeds into a tree). It is not immediate, but not hard to check, that if
all triangles $(x, y, z)$ are $\delta$-thin, then (1.2) is satisfied for some number $\delta'$ that may
differ by a constant factor from $\delta$. Conversely, in a space where (1.2) is satisfied
for all points $(p, x, y, z)$, all triangles are $\delta'$-thin, where $\delta'$ may differ from $\delta$ by a
constant factor.
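As an illustration of condition (1.2), the sketch below computes the smallest admissible $\delta$ over all quadruples of a finite point set. The two test spaces (six points on a line, and a $3 \times 3$ grid with the $\ell^1$ metric) and the helper names are our own choices for illustration, not objects from the paper.

```python
from itertools import product

def gromov2(d, x, y, p):
    """Twice the Gromov product: 2(x|y)_p = |x-p| + |y-p| - |x-y|."""
    return d(x, p) + d(y, p) - d(x, y)

def four_point_delta(points, d):
    """Smallest delta for which (1.2) holds for all x, y, z, p in `points`."""
    worst = 0
    for x, y, z, p in product(points, repeat=4):
        gap = min(gromov2(d, x, y, p), gromov2(d, y, z, p)) - gromov2(d, x, z, p)
        worst = max(worst, gap)
    return worst / 2

line = list(range(6))                                # a path, hence a tree
grid = [(i, j) for i in range(3) for j in range(3)]  # 3 x 3 grid, l^1 metric

delta_line = four_point_delta(line, lambda a, b: abs(a - b))
delta_grid = four_point_delta(
    grid, lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1]))
```

The line is a tree, so $\delta = 0$ suffices there, whereas on the grid the quadruple $p = (0,0)$, $x = (2,0)$, $y = (2,2)$, $z = (0,2)$ already forces $\delta \ge 2$.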

Of course a bounded space (in particular, a finite space such as $\mathcal{S}_n$) is trivially
hyperbolic, but we will be interested in situations where the constant $\delta$ may or may
not stay bounded as the size of the space tends to $\infty$.

Our first result makes the connection between Theorem 0 and Gromov hyperbolic
spaces, where we look at the two definitions of hyperbolic constants
suitably weakened. For $0 < a < 1$, let $\partial B(an)$ be the sphere of radius $an$, that
is, the set of points at distance $\lfloor an \rfloor$ from the origin. We let $\nu$ be the hitting
distribution of $\partial B(an)$ by $\sigma_t$, that is, $\nu$ is the law on $\partial B(an)$ of $\sigma_T$ where
$T = \inf\{t > 0,\ d(\sigma_t) = \lfloor an \rfloor\}$.


THEOREM 1. Let $x$, $y$ be sampled from $\nu$ independently, and set $p = I$, the
identity element.


1. If $a < 1/4$, then there is some $\delta < \infty$ (depending only on $a$), such that
$E(x|y)_p \le \delta.$


Moreover, with probability asymptotically 1, there is a geodesic between
$x$ and $y$ that comes within expected distance $\delta' < \infty$ of $p$.
2. If $a > 1/4$, then


$E(x|y)_p \sim \delta n$


for some $0 < \delta < \infty$. Moreover, no geodesic between $x$ and $y$ can approach $p$
closer than $\delta' n$ with probability asymptotically 1, where $0 < \delta' < \infty$.


In the statement of the theorem and in the rest of the paper, $a_n \sim b_n$ means that
$a_n / b_n \to 1$.


REMARK. It follows immediately from Theorem 1 that when $a < 1/4$, with
probability asymptotically 1,


$(x|z)_p \ge (x|y)_p \wedge (y|z)_p - \delta$


for independent $x$, $y$, $z$ sampled from $\nu$, hence the idea that definition 1 of hyperbolicity
is satisfied "asymptotically $\nu$-almost surely." The statement about the geodesics
shows that definition 2 is satisfied "asymptotically $\nu$-almost surely" when
$a < 1/4$.


At this point we should emphasize that the result in Theorem 1 involves hyperbolic
constants that are different from the standard definitions discussed above in
several important ways. The most obvious difference comes from the randomness
of $x$ and $y$, and from the fact that the roles played by $x$, $y$ and $p$ are somewhat different.
Here $p$ is a fixed reference point, whereas Gromov's definition requires that
every triangle should be thin. Another issue is that, corresponding to the second
definition of hyperbolicity with thin triangles, we show that there exists a certain
geodesic between $x$ and $y$ having the desired properties. As we will see below in
Theorem 6, there may be a great many geodesics between two given points in $\mathcal{S}_n$.
More importantly, these geodesics can be far apart, as the following
concrete example shows:


$\sigma = (1\ 14\ 5\ 11)\ (2)\ (3\ 9)\ (4\ 13\ 6)\ (7\ 12\ 8)\ (10)$
$\pi_1 = (1)\ (14)\ (5)\ (11)\ (2)\ (3)\ (9)\ (4\ 13\ 6)\ (7\ 12\ 8)\ (10)$
$\pi_2 = (1\ 14\ 5\ 11)\ (2)\ (3\ 9)\ (4)\ (13)\ (6)\ (7)\ (12)\ (8)\ (10)$
$\pi_1 \pi_2^{-1} = (11\ 5\ 14\ 1)\ (2)\ (9\ 3)\ (4\ 13\ 6)\ (7\ 12\ 8)\ (10)$


Since for any permutation $\pi$ we have $d(\pi) = n - \#$ cycles of $\pi$, $d(\sigma) = 8$.
$\pi_1$ and $\pi_2$ are on two geodesics from $I$ to $\sigma$, but $d(\pi_1, \pi_2) = d(\pi_1 \pi_2^{-1}) = 8$.
In general if $d(\sigma) = cn/2$ with $c < 1$, and we divide the cycles at random into
two groups, we can define $\pi_1$ to have cycle structure given by the first group of $\sigma$
staying as it is and the second completely broken into cycles of length 1. If we
define $\pi_2$ by the exchange of the two groups, then we will have $d(\sigma, \pi_i) = cn/4$
and $d(\pi_1, \pi_2) = cn/2$.
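The distances in this example can be verified mechanically from the identity $d(\pi) = n - \#$ cycles of $\pi$. The sketch below does so with $n = 14$; the helper names (`perm_from_cycles`, `dist`, and so on) are ours, and we use $d(\pi_1, \pi_2) = d(\pi_1^{-1}\pi_2)$, which has the same cycle count as $\pi_1 \pi_2^{-1}$.

```python
def perm_from_cycles(n, cycles):
    """Permutation of {1..n} as a dict i -> image, from its nontrivial cycles."""
    perm = {i: i for i in range(1, n + 1)}
    for cyc in cycles:
        for a, b in zip(cyc, cyc[1:] + cyc[:1]):
            perm[a] = b
    return perm

def compose(p, q):
    """(p q)(i) = p(q(i))."""
    return {i: p[q[i]] for i in q}

def inverse(p):
    return {image: i for i, image in p.items()}

def num_cycles(p):
    seen, count = set(), 0
    for start in p:
        if start not in seen:
            count += 1
            j = start
            while j not in seen:
                seen.add(j)
                j = p[j]
    return count

def dist(p, q=None):
    """d(p) = n - #cycles(p); d(p, q) = d(p^{-1} q)."""
    if q is None:
        return len(p) - num_cycles(p)
    return dist(compose(inverse(p), q))

n = 14
sigma = perm_from_cycles(n, [(1, 14, 5, 11), (3, 9), (4, 13, 6), (7, 12, 8)])
pi1 = perm_from_cycles(n, [(4, 13, 6), (7, 12, 8)])
pi2 = perm_from_cycles(n, [(1, 14, 5, 11), (3, 9)])
```

Both $\pi_1$ and $\pi_2$ sit halfway along a geodesic from $I$ to $\sigma$ (distance 4 from each end), yet they are at distance 8 from each other.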


1.2. The geometry of $G_n$. How much can we learn from Theorem 1 about the
global geometry of $G_n$? To answer this question, we need to see how special a
choice it is to sample the points $x$ and $y$ according to the hitting distribution $\nu$.
(The fact that $p = I$ is a fixed reference point is not too important, due to the
transitivity of $G_n$.) We begin with an apparently unrelated question, which is to ask
how large a ball of radius $an$ is.


THEOREM 2. If $0 < a < 1$, then as $n \to \infty$, we have $|B(I, an)| \approx (n!)^a$ in a
logarithmic sense, that is,


$\lim_{n \to \infty} \frac{\log |B(I, an)|}{n \log n} = a.$


This result is probably not new, but we have not found it in the literature. Our
original motivation for studying the volume growth in $G_n$ was to try to understand
the phase transition of Theorem 0 in terms of the geometry of $G_n$. Our first thought
was that since the speed was nonsmooth we might see a change in the volume
growth. The above result contradicts this idea.

To put our next two results into perspective it is useful to contrast them with
Brownian motion $B_t$ on a $d$-dimensional manifold of constant negative curvature
$-1$. In that case as $t \to \infty$, if $d(B_t)$ is the distance from the origin, then
(see [11], e.g.) there is a constant $v$ so that


$d(B_t)/t \to v$ as $t \to \infty$.


In the case of Brownian motion on hyperbolic space, rotational symmetry implies
that the hitting distribution is uniform. In contrast for the random transposition random
walk, we will see in Theorem 3 that the hitting distribution is asymptotically
singular with respect to the uniform distribution on $\partial B(I, an)$.


THEOREM 3. Let $|C_1|$ be the length of the cycle that contains 1. Under $\mu$, the
uniform distribution on $\partial B(I, an)$,


$|C_1| \Rightarrow G$


where $G$ is a geometric r.v. with $P(G > k) = (b/(1 + b))^k$ and $b$ satisfies
$\log(1 + b)/b = 1 - a$.


To describe the hitting distribution $\nu$, we note that (1.1) suggests that it will
be the same as the distribution of $\sigma_{cn/2}$ where $c = u^{-1}(a)$. When $a > 1/2$ this is
much different from the distribution in Theorem 3 since in this case $c > 1$ and
Schramm [12] has shown that $\sigma_{cn/2}$ has cycles of lengths of order $n$.

Here we will concentrate on what happens when $a < 1/2$ and $c = 2a$. In
this case results in [3] show that as $n \to \infty$, the number of fragmentations before
time $cn/2$ is asymptotically a Poisson random variable with mean
$\kappa(c) = -(\log(1 - c) + c)/2$. In particular,


$P(d(\sigma_{cn/2}) = cn/2) \to e^{-\kappa(c)} = \sqrt{1 - c}\; e^{c/2}.$


It will be convenient to approach the hitting distribution $\nu$ by the distribution $\nu_0$
of $\sigma_{cn/2}$ conditioned on no fragmentation. More generally, if $\nu_k = \nu$ conditioned
on exactly $k$ fragmentations before the hitting time,


$\nu = e^{-\kappa(c)} \sum_{k=0}^{\infty} \frac{\kappa(c)^k}{k!}\, \nu_k + o(1).$


To study $\nu_0$, we recall the connection with random graphs developed in [3]:
when we transpose $i$ and $j$ we draw an edge between $i$ and $j$. In order for the
distance from the identity to increase by 1 at each time, each transposition must
involve indices from two different cycles and will merge them into one. In terms of
the random graph, this means that all components are trees. Using results from [3],
it is straightforward to show:


THEOREM 4. Let $|C_1|$ be the length of the cycle that contains 1. Let $c < 1$.
Under $\nu_0$,


$P(|C_1| = k) \to \frac{1}{c}\,\frac{k^{k-1}}{k!}\,(ce^{-c})^k$ for all $k \ge 1$.


Theorems 3 and 4 show that the uniform distribution $\mu$ and the hitting distribution
$\nu_0$ concentrate on different permutations. In the first case the number of fixed
points will be close to its expected value $nP(|C_1| = 1) = n/(1 + b)$. In the second
it will be close to $ne^{-c}$ by Theorem 4. This is made precise by the following
theorem.
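To see concretely that the two fixed-point densities differ, one can solve $\log(1+b)/b = 1 - a$ numerically and compare $1/(1+b)$ with $e^{-c}$ for $c = 2a$. The sketch below uses a bisection on a logarithmic scale; the solver, its bracketing interval and the sample value $a = 0.2$ are our own illustrative choices.

```python
from math import exp, log1p

def solve_b(a, lo=1e-12, hi=1e12, iters=200):
    """Solve log(1+b)/b = 1 - a; the left side decreases from 1 to 0 in b."""
    target = 1.0 - a
    for _ in range(iters):
        mid = (lo * hi) ** 0.5       # bisect on a logarithmic scale
        if log1p(mid) / mid > target:
            lo = mid                 # b is still too small
        else:
            hi = mid
    return (lo * hi) ** 0.5

a = 0.2
b = solve_b(a)
uniform_fixed_density = 1.0 / (1.0 + b)  # limit of P(|C_1| = 1) under mu
walk_fixed_density = exp(-2 * a)         # limit of P(|C_1| = 1) under nu_0, c = 2a
```

A gap between the two densities means that the number of fixed points alone separates typical permutations under $\mu$ from typical permutations under $\nu_0$.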


THEOREM 5. As $n \to \infty$, the hitting distribution $\nu$ and the uniform distribution
$\mu$ on a sphere of radius $an$ are asymptotically singular:


$d_{TV}(\mu, \nu) \to 1.$


Let $t = \lfloor cn/2 \rfloor$ with $c < 1$. To understand why $\nu$ is different from $\mu$ we will
examine the Radon-Nikodym derivative $r(\sigma) = d\nu_0/d\mu$. It is not hard to show
that


THEOREM 6. Suppose $d(\sigma) = t$ and $m_1, \dots, m_j$ are the cycle lengths of $\sigma$.
The number of paths of length $t$ from $I$ to $\sigma$ is


$t! \prod_{i=1}^{j} \frac{m_i^{m_i - 2}}{(m_i - 1)!}.$


If $t = \lfloor cn/2 \rfloor$ with $c < 1$, then


$r(\sigma) = K_{n,t} \prod_{i=1}^{j} \frac{m_i^{m_i - 2}}{(m_i - 1)!},$


where $K_{n,t}$ is a constant that only depends on $n$ and $t$.
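The path-counting formula of Theorem 6 can be checked by brute force on a small symmetric group. The sketch below counts geodesics from the identity level by level in the Cayley graph and compares with the product formula; permutations are stored as 0-indexed tuples, a convention of ours rather than the paper's.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial

def cycle_lengths(p):
    """Cycle lengths of a permutation stored as a tuple, p[i] = image of i."""
    seen, lengths = set(), []
    for s in range(len(p)):
        if s not in seen:
            length, j = 0, s
            while j not in seen:
                seen.add(j)
                j = p[j]
                length += 1
            lengths.append(length)
    return lengths

def geodesic_counts(n):
    """Number of shortest paths from the identity to every permutation."""
    def dist(p):
        return n - len(cycle_lengths(p))
    identity = tuple(range(n))
    counts, level = {identity: 1}, [identity]
    while level:
        nxt = {}
        for p in level:
            for i, j in combinations(range(n), 2):
                q = list(p)
                q[i], q[j] = q[j], q[i]          # multiply p by (i j)
                q = tuple(q)
                if dist(q) == dist(p) + 1:       # keep only outward steps
                    nxt[q] = nxt.get(q, 0) + counts[p]
        counts.update(nxt)
        level = list(nxt)
    return counts

def formula(p):
    """t! * prod_i m_i^(m_i - 2) / (m_i - 1)!, as in Theorem 6."""
    lengths = cycle_lengths(p)
    t = len(p) - len(lengths)
    value = Fraction(factorial(t))
    for m in lengths:
        value *= Fraction(m) ** (m - 2) / factorial(m - 1)
    return value

counts = geodesic_counts(4)
```

For a single $m$-cycle the formula reduces to $m^{m-2}$, the classical count of minimal factorizations of a cycle into transpositions.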


The last result enables us to prove a stronger version of Theorem 5: it tells us
that the "support" of $\nu$ is concentrated on a set that is exponentially smaller than
the size of $\partial B(an)$.


THEOREM 7. Suppose $a < 1/2$. There exists a set $S_n \subset \partial B(an)$ such that
$\nu(S_n) \to 1$ as $n \to \infty$ and


$\limsup_{n \to \infty} \frac{1}{n} \log \frac{|S_n|}{|\partial B(an)|} < 0.$


1.3. The hyperbolic constant under the uniform measure. In Theorem 1, we
learn that if $x$ and $y$ are sampled from $\nu$, roughly speaking, the Gromov hyperbolicity
of the "support" breaks down at $a = 1/4$, that is, the hyperbolic constant
increases suddenly from $O(1)$ to $O(n)$ at this point.

However, the results from the previous section tell us that this "support" is (exponentially)
small with respect to the ambient space. It is therefore natural to ask
what happens to Theorem 1 when we replace $\nu$ with the uniform measure $\mu$ on
$\partial B(an)$. Theorem 8 will show that the qualitative behavior of the hyperbolic constant
remains the same. We prove that there is a threshold where the expected Gromov
inner product $E(\sigma|\pi)_p$ jumps from $O(1)$ to $O(n)$, but this time the critical
value is $a = 1 - \log 2 \approx 0.31$, rather than $a = 1/4$.

When $\sigma$ and $\pi$ are independent uniform permutations on $\partial B(an)$, by the
transitivity of $G_n$, it is enough to analyze $d(\sigma, \pi)$ to understand $(\sigma|\pi)_p$, the
inner Gromov product. Since $d(\sigma, \pi) = d(I, \sigma^{-1}\pi)$, which has the same law
as $d(I, \sigma\pi)$, it will be enough to characterize the values of $a$ for which $d(I, \sigma\pi) =
2an + o(n)$ and those for which it is $< 2an$.


THEOREM 8. Let $0 < a < 1$ and let $\sigma$, $\pi$ be two random independent points
chosen uniformly from $\partial B(an)$. Then:
1. If $a < 1 - \log 2$,
$E(\sigma|\pi)_p \le \delta (\log n)^2$


for some $0 < \delta = \delta(a) < \infty$. Moreover, with probability asymptotically 1, there
is a geodesic between $\sigma$ and $\pi$ that comes within distance at most $\delta (\log n)^2$
of $p$.
2. If $a > 1 - \log 2$,
$E(\sigma|\pi)_p \sim \delta n$
for some $\delta = \delta(a) > 0$. Moreover, no geodesic can approach $p$ closer than $\delta' n$
for some $0 < \delta' < \infty$.


REMARK. The $O((\log n)^2)$ bound in part 1 of the theorem could probably be
improved into an $O(1)$ bound (just like in Theorem 1) with some more work, but
we have not tried to do so. In part 2, by analogy with Berestycki and Durrett [3],
we conjecture that the fluctuations are of order exactly $n^{1/2}$ in the supercritical
regime. More precisely, it should be true that when $a > 1 - \log 2$,


$n^{-1/2}\bigl(E(\sigma|\pi)_p - \delta n\bigr) \Rightarrow \mathcal{N}(0, \chi),$
where $\delta$ is the limit in part 2 of the theorem, and $\chi$ is some parameter.


2. Asymptotic hyperbolicity under $\nu$. The first result we prove is Theorem 1.


THEOREM 1. Let $x$, $y$ be sampled from $\nu$ independently, and set $p = I$, the
identity element.
1. If $a < 1/4$, then there is some $\delta < \infty$ (depending only on $a$), such that
$E(x|y)_p \le \delta.$


Moreover, with probability asymptotically 1, there is a geodesic between
$x$ and $y$ that comes within expected distance $\delta' < \infty$ of $p$.
2. If $a > 1/4$, then
$E(x|y)_p \sim \delta n$


for some $0 < \delta < \infty$. Moreover, no geodesic between $x$ and $y$ can approach $p$
closer than $\delta' n$ with probability asymptotically 1, where $0 < \delta' < \infty$.


Sketch of the proof. Let $X_t$ and $Y_t$ be two independent random walks
starting at the origin. Let them run until the times $T$ and $T'$ where they respectively
hit the sphere $\partial B(an)$. Then the transitivity of the Cayley graph
of $\mathcal{S}_n$ and the reversibility of the increments of the random walk imply that
$(X_T, X_{T-1}, \dots, p, Y_1, \dots, Y_{T'})$ is a random walk path of length $T + T'$. Hence
the distance between $X_T = x$ and $Y_{T'} = y$ is the same as $d(\sigma_{T+T'})$. By Theorem 0,
$T$ and $T' \approx \frac{1}{2} u^{-1}(a) n$, so applying Theorem 0 again, when $a < 1/4$,
$|x - y| \approx 2an = |x| + |y|$ [the random walk runs for a time $2an < n/2$ and
there are only $O(1)$ fragmentations]. For $a > 1/4$, the random walk is run
for time $u^{-1}(a) n$ which, in view of Theorem 0, means that $c = 2u^{-1}(a)$, and
$|x - y| \approx n\,u(2u^{-1}(a)) < 2an$. See Figure 1.

The claim about the existence of a geodesic that makes the triangle (x, p, y) thin 
involves necessarily another argument, since geodesics may be far apart. However, 
it is not very hard to construct by hand a geodesic between the identity and x such 
that each point of the random walk path is within O(1) of this geodesic. Applying 
this construction to the two random walk paths gives the result of Theorem 1. 


PROOF OF THEOREM 1. Let us first deal with the case $a < 1/4$ and prove that in
this case $E(x|y)_p \le \delta$. Keeping the same notation as above, note that $an + O(1)$
steps are sufficient for $X$ to reach distance $an$. Indeed, after $an$ steps, $X_{an}$ is
at distance $an - \mathcal{X}_1$, where $\mathcal{X}_1$ is twice a Poisson random variable by Theorem 1
in [3]. It is immediate that in $\mathcal{X}_1$ steps the probability that $X$ has a fragmentation
converges to 0. Therefore $T - an \Rightarrow \mathcal{X}_1$ (remember that here time
is measured discretely). Similarly $T' - an \Rightarrow \mathcal{X}_2$ where $\mathcal{X}_1$, $\mathcal{X}_2$ are i.i.d. Hence
$(x = X_T, X_{T-1}, \dots, X_1, I, Y_1, \dots, Y_{T'} = y)$ is a random walk of $2an + \mathcal{X}_1 + \mathcal{X}_2$
steps. In the worst case possible all $\mathcal{X}_1 + \mathcal{X}_2$ steps represent "backward" steps
(meaning, toward $x$ rather than $y$). Hence if $\mathcal{X}_3 = 2an - d(I, \sigma_{2an})$ (so that $\mathcal{X}_3$ is
also twice a Poisson random variable, but with a different parameter),


$2E(x|y)_p = 2an - E|x - y|$
$\le 2an - E\,d(I, \sigma_{2an}) + E(\mathcal{X}_1 + \mathcal{X}_2)$
$\le E(\mathcal{X}_1) + E(\mathcal{X}_2) + E(\mathcal{X}_3) < \infty.$


FIG. 1. Two independent random walks run until they hit the sphere of radius $an$.


It is slightly simpler to prove that when $a > 1/4$, $E(x|y)_p \sim \delta n$. Indeed, in this
case, by Theorem 0, we have that, with probability tending to 1,


$\tfrac{1}{2} u^{-1}(a) - \varepsilon \le T/n \le \tfrac{1}{2} u^{-1}(a) + \varepsilon.$
Therefore


$\inf_{|t/n - u^{-1}(a)| \le 2\varepsilon} d(I, \sigma_t) \le |x - y| \le \sup_{|t/n - u^{-1}(a)| \le 2\varepsilon} d(I, \sigma_t).$


An easy estimate shows that we are never off by more than $O(n^{1/2})$ if we evaluate
the distance of the random walk by counting the number of clusters of the random
graph rather than the number of cycles of $\sigma_t$. But for the random graph, the number
of clusters is monotone in $t$. Hence, if $\alpha$ denotes $u(2u^{-1}(a))$, we have by
continuity of $u$ that


$\alpha - \varepsilon' + o(1) \le \frac{E|x - y|}{n} \le \alpha + \varepsilon' + o(1)$


and $\varepsilon'$ can be made as small as desired by continuity of $u$. Therefore


$\frac{E|x - y|}{n} \to \alpha.$


It suffices now to prove that $\alpha < 2a$, that is, $u(2u^{-1}(a)) < 2a$ or, after the change of
variable $c = u^{-1}(a)$, it suffices to prove $u(2c) < 2u(c)$ for all $c > 1/2$. This fact
is a consequence of the sublinearity of $u$: it will be proved later that $u$ is strictly
concave on $[1, \infty)$, from which it follows that $u(c) > u(2c)/2$.
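The threshold $a = 1/4$ can be seen numerically from the limit $\alpha = u(2u^{-1}(a))$. The sketch below evaluates the series (1.1) with a fixed truncation and inverts $u$ by bisection; the truncation point, the bracketing interval and the two sample radii are our own choices for illustration.

```python
from math import exp, lgamma, log

def u(c, terms=2000):
    """Truncated series (1.1), each term computed in log space."""
    total = 0.0
    for k in range(1, terms + 1):
        total += exp((k - 2) * log(k) - lgamma(k + 1) + k * (log(c) - c))
    return 1.0 - total / c

def u_inverse(a, lo=1e-9, hi=60.0, iters=100):
    """Invert the increasing function u on (0, hi) by bisection."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if u(mid) < a:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def alpha(a):
    """Limit of E|x - y| / n, namely u(2 u^{-1}(a))."""
    return u(2 * u_inverse(a))

below = alpha(0.2)  # a < 1/4: expected to equal 2a = 0.4
above = alpha(0.4)  # a > 1/4: expected to be strictly less than 2a = 0.8
```

The gap $2a - \alpha$ is, up to the factor $n/2$, the limiting Gromov product: it vanishes for $a < 1/4$ and is positive for $a > 1/4$.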

We now turn to the part of the theorem that concerns geodesics, and prove that
for a random walk $(X_t, t \le cn/2)$ of time-duration $cn/2$ with $c < 1$, there is a
geodesic between $\sigma = X_{cn/2}$ and $I$, that we call $\gamma$, such that
(2.1)    $E \sup_{t \le cn/2} d(X_t, \gamma) = O(1).$

This shows that when $c < 1$ there is a geodesic that stays close to the random walk
path. When $a < 1/4$, $p = I$ is on the random walk path that leads from $x$ to $y$, so
this shows that $E(d(I, \gamma)) = O(1)$, as claimed in the theorem. The case $a > 1/4$
is trivial by the triangle inequality.

Let $\tau_1, \dots, \tau_N$ be the sequence of transpositions that are the increments of the
random walk path leading to $\sigma$, so that $\sigma = \tau_1 \cdots \tau_N$. Let $\gamma$ be the geodesic between
$\sigma$ and $I$ defined by $\gamma_0 = \sigma$, $\gamma_1 = \sigma \tau_N$, $\gamma_2 = \gamma_1 \tau_{N-1}, \dots$, until the first
time $t$ such that multiplying $\gamma_t$ by $\tau_{N-t}$ would result in a coagulation of two cycles
of $\gamma_t$. We do not allow this possibility (otherwise $\gamma$ would not be a geodesic),
and simply skip $\tau_{N-t}$: $\gamma_{t+1} = \gamma_t \tau_{N-t-1}$. We will see in a moment that this path
never backtracks and that it ends at a bounded distance from $I$, to which it will be
necessary to add a (bounded) number of steps so that it actually ends at $I$.

Let $n(t)$ be the index of the transposition to be performed at time $t$ on $\gamma_t$. Note
that we can always write


$\gamma_t = \tau_1 \cdots \tau_{n(t)} \prod_{i \in K_t} \tau_i,$
where $K_t$ is a set whose size we will show is bounded. Indeed, even when we
skip $\tau_{n(t)}$ in $\gamma_t$ [so that $n(t) \in K_{t+1}$], the following transpositions $\tau_{n(t)-1}, \dots$ commute
with the members of $K_t$ with high probability and they can "jump above" the
terms in $K_t$ and cancel the rest of the transpositions $(\tau_1 \cdots \tau_{n(t)-1})$.
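The construction of $\gamma$ can be sketched in a few lines. In the simulation below (discrete time, a fixed seed, and the parameters $n = 300$, $c = 0.8$ are arbitrary choices of ours), the walk is run forward, then its increments are undone in reverse order, skipping every transposition that would merge two cycles. The final clean-up steps that bring the path all the way to $I$ are not implemented.

```python
import random

def cycle_count(p):
    seen, count = [False] * len(p), 0
    for s in range(len(p)):
        if not seen[s]:
            count += 1
            j = s
            while not seen[j]:
                seen[j] = True
                j = p[j]
    return count

def distance(p):
    """Distance to the identity: n minus the number of cycles."""
    return len(p) - cycle_count(p)

def same_cycle(p, i, j):
    """True if the distinct points i and j lie in the same cycle of p."""
    k = p[i]
    while k != i:
        if k == j:
            return True
        k = p[k]
    return False

def times(p, i, j):
    """Right-multiply p by the transposition (i j)."""
    q = list(p)
    q[i], q[j] = q[j], q[i]
    return q

def walk_and_gamma(n, c, seed=0):
    rng = random.Random(seed)
    taus = [tuple(rng.sample(range(n), 2)) for _ in range(int(c * n / 2))]
    walk = [list(range(n))]
    for i, j in taus:
        walk.append(times(walk[-1], i, j))
    gamma, skipped = [walk[-1]], 0
    for i, j in reversed(taus):
        if same_cycle(gamma[-1], i, j):
            gamma.append(times(gamma[-1], i, j))  # fragmentation: one step closer to I
        else:
            skipped += 1                          # would be a coagulation: skip it
    return walk, gamma, skipped

walk, gamma, skipped = walk_and_gamma(n=300, c=0.8)
gamma_dists = [distance(g) for g in gamma]
```

Since every step that is actually performed splits a cycle, the distance to $I$ drops by exactly one at each step of `gamma`, which is the "never backtracks" property used in the text.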


LEMMA 1. For all $t$, $E(|K_t|) \le O(1)$, where $O(1)$ is a constant that depends
only on $c < 1$. As a consequence, the path ends at bounded distance from the
identity and the distance $E \sup_{s} d(X_s, \gamma) = O(1)$.


PROOF. There are two ways to add a member to $K_{t+1}$ at time $t$. The first one
is that performing $\tau_{n(t)}$ will result in a coagulation, so that it is skipped by $\gamma$. The
other way is if $\tau_{n(t)}$ does not commute with one of the members of $K_{t-1}$: it then stays
stuck somewhere in $K_t$.

If $\tau_{n(t)} = (i, j)$, we claim that in order for $i$ and $j$ to be in the same cycle of $\gamma$,
$i$ and $j$ must belong to a component of the Erdős-Rényi graph associated with the
random walk that contains a cycle at time $cn/2$. We will prove this in a moment,
but if we admit this, then it follows that all transpositions in $K_t$ act on vertices that
belong to $U(cn/2)$, the unicyclic components of the random graph at time $cn/2$:
if $i \in K_t$, then either $\tau_i = (i, j)$ yields a coagulation in $\gamma_t$, or it does not commute
with some member $(k, l)$ of $K_{t-1}$, in which case $(i, j)$ overlaps with $(k, l)$. By
induction, $k, l \in U(cn/2)$, therefore so are $i$ and $j$.

Let us prove our claim that if $(i, j)$ would yield a coagulation in $\gamma_s$, then
$i, j \in U(cn/2)$. Let us observe first that $i$ and $j$ must already be in the same component
of the random graph: because $\tau_{n(t)}$ was performed on the random walk,
$i$ and $j$ were connected at that point in the random graph and they remain so. If
$i$ and $j$ are in different cycles of $\sigma$, then there must have been some ulterior fragmentation
in their cycles, so the claim holds. When they are in the same cycle of $\sigma$,
then there must be some transposition $\tau_i$ with $i \in K_t$ such that $i$ and $j$ are in different
cycles of $\gamma$ after $\tau_i$. Call those cycles $C_1$ and $C_2$. $\tau_i$ involves two members
$k$ and $l$ of $C_1 \cup C_2$. Moreover the cycle structure of $\gamma$ before $(k, l)$ is performed
must be of the form


$(i, \dots, k, \dots, j, \dots, l, \dots);$


otherwise $(k, l)$ cannot separate $i$ and $j$ at the next step. Unless $i$ and $j$ belong
to a complex component, this implies that the cycle structure of $\sigma$ has the same
form. However, this can only happen if $k$ and $l$ were connected to the component
of $i$ and $j$ at different times; otherwise the cycle structure would be of the form
$(i, \dots, j, \dots, k, \dots, l)$ or $(i, \dots, k, \dots, l, \dots, j)$. This implies in turn the existence
of a cycle in the random graph component of $i$ and $j$ at time $cn/2$.

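From there it follows in a straightforward way that $|K_t| \le |U(cn/2)|$ (in a unicyclic
component there are as many edges as vertices). It is now standard in the
theory of random graphs to show that $|U(cn/2)|$ is bounded in expectation:


$E|U(cn/2)| \sim \left(\frac{\pi}{8}\right)^{1/2} \sum_{k=2}^{\infty} \binom{n}{k} k^{k+1/2} \left(\frac{c}{n}\right)^{k} \left(1 - \frac{c}{n}\right)^{k(n-k) + \binom{k}{2} - k}$


$\to \left(\frac{\pi}{8}\right)^{1/2} \sum_{k=2}^{\infty} \frac{k^{k+1/2}}{k!}\,(ce^{-c})^k < \infty,$


which completes the proof of the lemma. □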


Now let $X_t$ be a point on the random walk path. Since $\gamma$ tries to perform all $\tau_i$
(in reverse), there is a time $s$ such that $n(s) = t$, that is, the next transposition to be
examined by $\gamma_s$ is $\tau_t$. At this time,


$\gamma_s = \tau_1 \cdots \tau_t \prod_{i \in K_s} \tau_i,$


so that $|K_s|$ steps are enough to reach $\gamma_s$ from $X_t$. Since $E(|K_s|) \le O(1)$ by the
lemma, we have proved that


$E \sup_{t \le cn/2} d(X_t, \gamma) \le O(1)$


and Theorem 1 is proved. □


3. Large deviations and volume growth. The goal of this section is to prove
Theorem 2, which we restate here for convenience.


THEOREM 2. If $0 < a < 1$, then as $n \to \infty$, we have $|B(I, an)| \approx (n!)^a$ in a
logarithmic sense, that is,
$\lim_{n \to \infty} \frac{\log |B(I, an)|}{n \log n} = a.$


Sketch of the proof. The proof of the result is more interesting than the
limit. We begin by recalling the dynamics of the Chinese restaurant process (see,
e.g., [9]). Customer 1 enters and sits at table 1. At step $i$, customer $i$ enters and
starts a new table with probability $1/i$ or sits to the left of customer $k$ where $k$ is
chosen uniformly at random in $\{1, \dots, i - 1\}$. From the tables we define a permutation
$\sigma$ by $\sigma(i) = i$ if customer $i$ is sitting by himself at his table and $\sigma(i) = k$ if $k$
sits to the right of $i$. It is easy to see that this defines a uniform random permutation
on $\mathcal{S}_n$, and that the cycle structure is given by listing the individuals at the tables
in clockwise order. It is well known that if $\sigma \in \mathcal{S}_n$, then $d(\sigma) = n\ -$ the number of
cycles of $\sigma$. In the Chinese restaurant process construction, let $\zeta_i$ be the random
variables taking the value 1 if customer $i$ sits at an existing table (and 0 otherwise).
The $\zeta_i$'s are independent Bernoulli$(1 - 1/i)$ random variables. Hence, if $\sigma$ is uniformly
distributed over $\mathcal{S}_n$, then $d(\sigma)$ has the same distribution as $S_n = \sum_{i=1}^{n} \zeta_i$.

The $i$'s where a new cycle starts (i.e., $\zeta_i = 0$) are distributed with the same law
as that of the occurrences of records for i.i.d. variables with continuous distribution
function (cf. [5], Example 6.2 of Chapter 1). From calculations in that example it
follows that $(n - S_n)/\log n \to 1$ in probability.

Returning to our calculation of the volume of the ball,


$|B(I, an)| = n!\,P(S_n \le an)$


for all $0 < a < 1$. It is straightforward to generalize large deviation results for
i.i.d. random variables (see, e.g., [5], Section 2.9) to prove Theorem 2. One begins
with the observation that for $\lambda > 0$


(3.1)    $P(S_n \le an) \le e^{\lambda a n}\, E e^{-\lambda S_n},$


optimizes the upper bound over $\lambda$ and uses a change of measure argument to prove
a corresponding lower bound.
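The identity $|B(I, r)| = n!\,P(S_n \le r)$ can be checked exactly for small $n$. The sketch below computes the law of $S_n$ with rational arithmetic and compares it with a brute-force count of permutations by distance to the identity; the choice $n = 6$ is ours, only to keep the enumeration small.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def law_of_S(n):
    """Exact law of S_n, a sum of independent Bernoulli(1 - 1/i), i = 1..n."""
    law = [Fraction(1)]
    for i in range(1, n + 1):
        p = Fraction(i - 1, i)
        new = [Fraction(0)] * (len(law) + 1)
        for k, weight in enumerate(law):
            new[k] += weight * (1 - p)
            new[k + 1] += weight * p
        law = new
    return law

def distance_counts(n):
    """counts[k] = number of permutations of n points at distance k from I."""
    counts = [0] * n
    for perm in permutations(range(n)):
        seen, cycles = set(), 0
        for s in range(n):
            if s not in seen:
                cycles += 1
                j = s
                while j not in seen:
                    seen.add(j)
                    j = perm[j]
        counts[n - cycles] += 1
    return counts

n = 6
law = law_of_S(n)
counts = distance_counts(n)
ball_sizes = [sum(counts[: r + 1]) for r in range(n)]
```

Here `ball_sizes[r]` is $|B(I, r)|$, and it should coincide with $n!\,P(S_n \le r)$ for every $r$.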


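PROOF OF THEOREM 2. Let $(\zeta_i, i \ge 1)$ be independent with $P(\zeta_i = 1) =
1 - 1/i$, and let $S_n = \sum_{i=1}^{n} \zeta_i$. Since $(\log n!)/(n \log n) \to 1$ it suffices to show:


LEMMA 2. Let $0 < a < 1$. As $n \to \infty$,
$\lim_{n \to \infty} \frac{\log P[S_n \le an]}{n \log n} = a - 1.$


PROOF. Let $\varphi_n(\lambda) = E[e^{-\lambda S_n}]$. Using the definition we have


$\varphi_n(\lambda) = \prod_{i=1}^{n} \left(\frac{1}{i} + \left(1 - \frac{1}{i}\right) e^{-\lambda}\right) =: \prod_{i=1}^{n} q_i,$


where $=:$ indicates that the last equation is the definition of $q_i$. By Markov's inequality
we have


(3.2)    $\log P[S_n \le an] \le n\left(\lambda a + \frac{1}{n} \log \varphi_n(\lambda)\right)$ for all $\lambda$.
If we define
$F_\lambda(x) = \frac{1}{\varphi_n(\lambda)} \int_{-\infty}^{x} e^{-\lambda y}\, dF^n(y),$
where $F^n$ is the distribution function of $S_n$, then $F_\lambda$ is a distribution function such that
$\mathrm{mean}(F_\lambda) = -\frac{\varphi_n'(\lambda)}{\varphi_n(\lambda)}$ and $\mathrm{var}(F_\lambda) = \frac{d}{d\lambda} \frac{\varphi_n'(\lambda)}{\varphi_n(\lambda)} > 0.$
To optimize (3.2), we want to choose $\lambda$ so that
$a + \frac{1}{n} \frac{\varphi_n'(\lambda)}{\varphi_n(\lambda)} = 0.$


This says that the mean of the transformed distributions is $na$, so
$a = -\frac{1}{n} \frac{\varphi_n'(\lambda)}{\varphi_n(\lambda)} = \frac{1}{n} \sum_{i=1}^{n} \frac{(i - 1)e^{-\lambda}}{(i - 1)e^{-\lambda} + 1}.$


We guess that the optimal $\lambda$ must be given (asymptotically) by $e^{-\lambda_{opt}} = b/n$. Plugging
this in the above gives
$\frac{1}{n} \sum_{j=0}^{n-1} \frac{jb/n}{(jb/n) + 1} \to \int_0^1 \frac{bx}{bx + 1}\, dx = 1 - \frac{1}{b} \log(b + 1).$


From this we see that we should choose $b$ so that $\log(b + 1)/b = 1 - a$.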
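The guess $e^{-\lambda_{opt}} = b/n$ can be tested numerically: with $b$ solving $\log(b+1)/b = 1 - a$, the tilted mean per step should be close to $a$ for large $n$. The sketch below uses a plain bisection for $b$; the bracketing interval and the sample values $a = 0.3$, $n = 10^5$ are our own choices.

```python
from math import log1p

def solve_b(a, lo=1e-9, hi=1e6, iters=200):
    """Bisection for log(1+b)/b = 1 - a (the left side is decreasing in b)."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if log1p(mid) / mid > 1.0 - a:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def tilted_mean_per_step(n, b):
    """(1/n) sum_i (i-1)w / ((i-1)w + 1) with w = e^{-lambda} = b/n."""
    w = b / n
    return sum((i - 1) * w / ((i - 1) * w + 1) for i in range(1, n + 1)) / n

a = 0.3
b = solve_b(a)
finite_n_mean = tilted_mean_per_step(10**5, b)
limit_mean = 1.0 - log1p(b) / b   # the integral of bx/(bx+1) over [0, 1]
```

The finite-$n$ sum is a Riemann sum for the integral, so the two quantities differ by a term of order $1/n$.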


Upper bound. Let us calculate what (3.2) gives with this choice of $\lambda$:


(3.3)    $\frac{1}{n} \log \varphi_n(\lambda) = -\log n + \frac{1}{n} \log \prod_{i=1}^{n} \left(\left(1 - \frac{1}{i}\right) b + \frac{n}{i}\right)$

(3.4)    $\sim -\log n + \int_0^1 \log(b + 1/x)\, dx.$


Since the last integral is finite it follows from (3.2) and $\lambda_{opt} = -\log b + \log n$ that


$\limsup_{n \to \infty} \frac{1}{n \log n} \log P(S_n \le na) \le a - 1,$


proving the upper bound half of Lemma 2.


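Lower bound. The argument is similar to that in [5], page 73. Fix any $v < a$
and $v < v' < a$. Define a real number $b'$ by


$\frac{\log(1 + b')}{b'} = 1 - v'.$
For any $\lambda$,


$P[S_n \le an] \ge \int_{nv}^{na} dF^n(x) \ge \int_{nv}^{na} \varphi_n(\lambda)\, e^{\lambda x}\, dF_\lambda(x)$
$\ge \varphi_n(\lambda)\, e^{\lambda n v}\, [F_\lambda(na) - F_\lambda(nv)].$


First, we prove that we can choose $\lambda$ such that $[F_\lambda(na) - F_\lambda(nv)] \to 1$. Recall
that the mean of $F_\lambda$ is $-\varphi_n'(\lambda)/\varphi_n(\lambda)$ and that the latter function starts at $n - \log n$ for
$\lambda = 0$, is strictly decreasing and equals $na$ when $\lambda = \lambda_{opt} = -\log b + \log n$, that
is, $e^{-\lambda_{opt}} = b/n$. Thus if we pick $\lambda = \lambda'$ such that $e^{-\lambda'} = b'/n$, the mean of $F_\lambda$
is by the lower bound calculation exactly $nv'$, and we have chosen $v < v' < a$.
To conclude that $F_\lambda(na) - F_\lambda(nv) \to 1$, instead of using a law of large numbers
argument such as in the i.i.d. case, we simply compute the variance of $F_\lambda$ directly.
Anticipating on the calculations of the next section, breaking the factor $e^{-\lambda x}$ in the
Radon-Nikodym derivative of $F_\lambda$ into $e^{-\lambda \sum_i x_i}$ means that we can see $F_\lambda$ as a sum
of independent Bernoulli random variables with parameter $\beta_i$ so that the variance
is


$\mathrm{var}\, F_\lambda = \sum_{i=1}^{n} \beta_i (1 - \beta_i) \le \sum_{i=1}^{n} \beta_i = \mathrm{mean}(F_\lambda) = nv' = O(n).$


Another way to obtain this inequality is to do more direct computations:


$\mathrm{var}\, F_\lambda = \frac{\varphi_n''(\lambda)}{\varphi_n(\lambda)} - \left(\frac{\varphi_n'(\lambda)}{\varphi_n(\lambda)}\right)^2$

$= \sum_{i=1}^{n} \frac{e^{-\lambda}(1 - 1/i)}{q_i} + \sum_{i \ne j} \frac{e^{-\lambda}(1 - 1/i)\, e^{-\lambda}(1 - 1/j)}{q_i q_j} - \left(\sum_{i=1}^{n} \frac{e^{-\lambda}(1 - 1/i)}{q_i}\right)^2$

$= \sum_{i=1}^{n} \frac{e^{-\lambda}(1 - 1/i)}{q_i} - \sum_{i=1}^{n} \frac{e^{-2\lambda}(1 - 1/i)^2}{q_i^2}$

$\le \sum_{i=1}^{n} \frac{e^{-\lambda}(1 - 1/i)}{q_i} = -\frac{\varphi_n'(\lambda)}{\varphi_n(\lambda)}$

$= \mathrm{mean}\, F_\lambda.$


Since the variance is $O(n)$, by Chebyshev's inequality we have that $F_\lambda(na) -
F_\lambda(nv) \to 1$. Therefore,


$\liminf_{n \to \infty} \frac{\log P[S_n \le an]}{n \log n} \ge v - 1.$


But $v$ is arbitrarily close to $a$, so the result is proved. □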


4. The uniform measure on $\partial B(an)$. Let $\{\bar\zeta_i, 1 \le i \le n\}$ have the distribution
of $(\zeta_i, 1 \le i \le n)$ conditional on $\sum_{i=1}^{n} \zeta_i = \lfloor an \rfloor$. Let $\{\zeta_i^{(\lambda)}, 1 \le i \le n\}$ be
independent with distribution


$dF_{\lambda,i}(x) = \frac{e^{-\lambda x}}{\phi_i(\lambda)}\, dF_i(x),$


where $F_i$, $\phi_i$ are respectively the distribution function and the Laplace transform
of $\zeta_i$, and $\lambda$ is the optimal parameter of the previous section, $e^{-\lambda} = b/n$. It is easy
to see that $\zeta_i^{(\lambda)}$ is another Bernoulli random variable with


$P(\zeta_i^{(\lambda)} = 1) = \frac{1}{1 + n/(b(i - 1))} =: \beta_i.$


We are now ready to prove:


THEOREM 3. Let $|C_1|$ be the length of the cycle that contains 1. Under $\mu$, the
uniform distribution on $\partial B(I, an)$,


$|C_1| \Rightarrow G,$


where $G$ is a geometric r.v. with $P(G > k) = (b/(1 + b))^k$ and $b$ satisfies
$\log(1 + b)/b = 1 - a$.


Sketch of the proof. The first part of demonstrating this is to recall what
Arratia, Barbour and Tavaré [1] call the Feller coupling. Start with vertex 1 and
choose $\sigma(1)$ uniformly from the $n$ possible choices. If this is 1, then take vertex 2
and choose $\sigma(2)$ uniformly from the $n - 1$ remaining possible choices. If $\sigma(1) \ne 1$,
then choose $\sigma(\sigma(1))$ uniformly from the $n - 1$ remaining choices, and so on, until
the final vertex where there is only one possible choice. Although the construction
is much different from the Chinese restaurant process, the reader should note that
if $\xi_i$ is defined by $\xi_i = 1$ if a cycle is not completed at the $i$th stage and 0 otherwise,
then $(\xi_i : 1 \le i \le n)$ and $\{\zeta_i : 1 \le i \le n\}$ have the same distribution up to time
reversal: $\xi_i$ has the law of $\zeta_{n-i+1}$.

From the last observation it follows that $N = \inf\{i : \xi_i = 0\}$ has the same distribution
as the length of the cycle containing 1. We can now conclude the proof
of the theorem, using the large deviation calculation of the volume, and an argument
called the Gibbs conditioning principle (see [4]). This principle asserts that
the distribution of the $\zeta_i$ conditional on $\sum_{i=1}^{n} \zeta_i = an$ should be asymptotically
independent and their law given by that which minimizes the entropy, that is, the
random variables $\zeta_i^{(\lambda)}$ with distribution


(4.1)    $\frac{1}{\phi_i(\lambda)}\, e^{-\lambda x}\, dF_i(x),$


where $F_i$, $\phi_i$ are respectively the d.f. and the Laplace transform of $\zeta_i$, and $\lambda$ is the
parameter that optimizes (3.1), that is, $e^{-\lambda} = b/n$.


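PROOF OF THEOREM 3. We will first need a lemma.


LEMMA 3. For any $n \ge 1$ and for every $\lambda > 0$,


$(\bar\zeta_1, \dots, \bar\zeta_n) \stackrel{d}{=} (\zeta_1^{(\lambda)}, \dots, \zeta_n^{(\lambda)})$ given $\sum_{i=1}^{n} \zeta_i^{(\lambda)} = \lfloor an \rfloor$.


PROOF. Let $f_1, \dots, f_n$ be bounded nonnegative Borel functions:


$E\Bigl(\prod_{i=1}^{n} f_i(\bar\zeta_i)\Bigr) = E\Bigl(\prod_{i=1}^{n} f_i(\zeta_i) \Bigm| \sum_{i=1}^{n} \zeta_i = \lfloor an \rfloor\Bigr)$

$= E\Bigl(\prod_{i=1}^{n} f_i(\zeta_i);\ \sum_{i=1}^{n} \zeta_i = \lfloor an \rfloor\Bigr)\, P\Bigl(\sum_{i=1}^{n} \zeta_i = \lfloor an \rfloor\Bigr)^{-1}.$


On the other hand,


$E\Bigl(\prod_{i=1}^{n} f_i(\zeta_i^{(\lambda)});\ \sum_{i=1}^{n} \zeta_i^{(\lambda)} = \lfloor an \rfloor\Bigr)$

$= \int \cdots \int \prod_{i=1}^{n} f_i(x_i)\, 1_{\{\sum_i x_i = \lfloor an \rfloor\}} \prod_{i=1}^{n} dF_{\lambda,i}(x_i)$

$= \int \cdots \int \prod_{i=1}^{n} f_i(x_i)\, 1_{\{\sum_i x_i = \lfloor an \rfloor\}}\, \frac{e^{-\lambda \sum_i x_i}}{\prod_{i=1}^{n} \phi_i(\lambda)} \prod_{i=1}^{n} dF_i(x_i)$

$= \frac{e^{-\lambda \lfloor an \rfloor}}{\prod_{i=1}^{n} \phi_i(\lambda)} \int \cdots \int \prod_{i=1}^{n} f_i(x_i)\, 1_{\{\sum_i x_i = \lfloor an \rfloor\}} \prod_{i=1}^{n} dF_i(x_i)$

$= \frac{e^{-\lambda \lfloor an \rfloor}}{\prod_{i=1}^{n} \phi_i(\lambda)}\, E\Bigl(\prod_{i=1}^{n} f_i(\zeta_i);\ \sum_{i=1}^{n} \zeta_i = \lfloor an \rfloor\Bigr).$


We can now divide and multiply by the probability of the events in the two sides
of this equation to obtain that for some constant $C > 0$


$E\Bigl(\prod_{i=1}^{n} f_i(\zeta_i^{(\lambda)}) \Bigm| \sum_{i=1}^{n} \zeta_i^{(\lambda)} = \lfloor an \rfloor\Bigr) = C\, E\Bigl(\prod_{i=1}^{n} f_i(\zeta_i) \Bigm| \sum_{i=1}^{n} \zeta_i = \lfloor an \rfloor\Bigr).$


By taking $f_1 = \cdots = f_n = 1$ we see that $C = 1$ and the lemma is proved. □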
We will need another lemma:


LEMMA 4. The $\zeta_i^{(\lambda)}$ satisfy a local central limit theorem:
$P\Bigl(\sum_{i=1}^{n} \zeta_i^{(\lambda)} = \lfloor an \rfloor\Bigr) \sim C n^{-1/2}.$


PROOF. The proof of this local limit theorem follows very closely that of the
usual i.i.d. case, which can be found in Theorem 5.2 of [5]. Let $\beta_m = P(\zeta_m^{(\lambda)} = 1)$
[i.e., $\beta_m = (1 + n/(b(m - 1)))^{-1}$], and let $X_{m,n} = n^{-1/2}(\zeta_m^{(\lambda)} - \beta_m)$ be the rescaled
Bernoulli variable. We start by noticing that the $X_{m,n}$ satisfy the hypotheses of the
Lindeberg-Feller theorem (Theorem 4.5 in [5]). Indeed, they are independent by
definition; for all $\varepsilon > 0$, $P(|X_{m,n}| > \varepsilon) = 0$ as soon as $n^{-1/2} < \varepsilon$, since $\zeta_m^{(\lambda)} \le 1$
and $\beta_m \le 1$ as well. Moreover,


$\sum_{m=1}^{n} E(X_{m,n}^2) = \frac{1}{n} \sum_{m=1}^{n} \beta_m (1 - \beta_m) \to \int_0^1 \frac{x/b}{(x + 1/b)^2}\, dx =: \sigma^2.$


Therefore $\sum_{m=1}^{n} X_{m,n} \Rightarrow \mathcal{N}(0, \sigma^2)$. At this point, the proof of the local limit theorem
from [5] can be reproduced exactly. Therefore


$\sup_{x \in \mathbb{R}} \Bigl| n^{1/2}\, P\Bigl(\sum_{m=1}^{n} X_{m,n} = x\Bigr) - n(x) \Bigr| \to 0,$


where $n(x) := (2\pi\sigma^2)^{-1/2} \exp(-x^2/2\sigma^2)$. Since $\sum_{m=1}^{n} \beta_m \sim an$, and since $n(\cdot)$
is a continuous function, we can conclude the proof of the lemma by the above
uniform convergence. □


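Now, by the Feller coupling, $|C_1| \overset{d}{=} \inf\{k \ge 1 : \zeta_{n-k+1} = 0\}$, that is, we must reverse
the time of the Chinese restaurant process. Hence by Lemmas 3 and 4:

$$P[|C_1| > k] = P\Big[\zeta_n = 1, \dots, \zeta_{n-k+1} = 1 \ \Big|\ \sum_{i=1}^n \zeta_i = \lfloor an \rfloor\Big]$$

$$= P\Big[\tilde\zeta_n = 1, \dots, \tilde\zeta_{n-k+1} = 1 \ \Big|\ \sum_{i=1}^n \tilde\zeta_i = \lfloor an \rfloor\Big]$$

$$= \frac{1}{P[\sum_{i=1}^n \tilde\zeta_i = \lfloor an \rfloor]}\, P\Big[\tilde\zeta_n = 1, \dots, \tilde\zeta_{n-k+1} = 1,\ \sum_{i=1}^{n-k} \tilde\zeta_i = \lfloor an \rfloor - k\Big]$$

$$= \frac{1}{P[\sum_{i=1}^n \tilde\zeta_i = \lfloor an \rfloor]}\, \prod_{i=1}^{k} P[\tilde\zeta_{n-i+1} = 1]\ P\Big[\sum_{i=1}^{n-k} \tilde\zeta_i = \lfloor an \rfloor - k\Big]$$

$$\sim \frac{1}{1 + n/(b(n-1))} \cdots \frac{1}{1 + n/(b(n-k))} \to \frac{1}{(1 + 1/b)^k}.$$

Hence Theorem 2 is proved. □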
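The tilting parameter can be sanity-checked numerically. The sketch below is only an illustration, not part of the proof: it solves $\log(1+b)/b = 1 - a$ by bisection, confirms that the tilted means $\beta_m = (1 + n/(b(m-1)))^{-1}$ sum to approximately $an$, and lists the limiting geometric tail $(1 + 1/b)^{-k}$. The helper names `solve_b` and `beta`, and the values $a = 0.3$, $n = 100000$, are assumptions made for the example.

```python
import math


def solve_b(a, iters=200):
    """Solve log(1 + b) / b = 1 - a for b > 0 by bisection on a log scale (0 < a < 1)."""
    target = 1.0 - a
    lo, hi = 1e-12, 1e12
    for _ in range(iters):
        mid = math.sqrt(lo * hi)            # geometric midpoint
        if math.log1p(mid) / mid > target:  # log(1+b)/b is decreasing in b
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def beta(m, n, b):
    """Tilted success probability beta_m = (1 + n / (b (m - 1)))^(-1); beta_1 = 0."""
    return 0.0 if m == 1 else 1.0 / (1.0 + n / (b * (m - 1)))


a, n = 0.3, 100_000
b = solve_b(a)
# (1/n) * sum of beta_m should be close to a, i.e. the tilted sum has mean about a*n.
mean_fraction = sum(beta(m, n, b) for m in range(1, n + 1)) / n
# Limiting tail probabilities P(|C_1| > k) = (1 + 1/b)^(-k) for k = 0, 1, 2, 3.
tail = [(1.0 + 1.0 / b) ** (-k) for k in range(4)]
```

The Riemann-sum check reflects the identity $\int_0^1 bx/(1+bx)\,dx = 1 - \log(1+b)/b = a$.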


5. Asymptotic singularity between $\mu$ and $\nu$. In this section we give a proof
of Theorem 5 that follows in an almost straightforward way from Theorems
2 and 3: $\nu$ and $\mu$ concentrate on permutations that have a different number of
fixed points. First recall the statement of the theorem:


THEOREM 5. As $n \to \infty$, the hitting distribution $\nu$ and the uniform distribution
$\mu$ on a sphere of radius $an$ are asymptotically singular:

$$d_{TV}(\mu, \nu) \to 1.$$


LEMMA 5. The random partition of $\{1, \dots, n\}$ derived from $\nu$ is exchangeable.


PROOF. The probability of obtaining a given partition of $\{1, \dots, n\}$ under $\nu$ depends only
on the sizes of its blocks, which stay the same under the action of a given
permutation. Hence $\nu$ yields an exchangeable partition of $\{1, \dots, n\}$. □


An immediate consequence is that the expected number of fixed points is
$n\nu(|C_1| = 1) = n/(1 + b)$. Next we show that under $\nu$ the number of fixed points $N$
is close to its expected value.


LEMMA 6.
$$\operatorname{var} N = o(n^2)$$
under $\nu$.

PROOF. Let $\chi_i = \mathbf{1}_{\{\zeta_i = 0,\ \zeta_{i+1} = 0\}}$ be the indicator of the event that in the
conditioned Chinese restaurant process, client number $i$ sits by himself. Then $N = \sum_i \chi_i$
and

$$\operatorname{var} N = \sum_{i=1}^n \operatorname{var} \chi_i + 2 \sum_{i<j} \operatorname{cov}(\chi_i, \chi_j) \le n + 2 \sum_{i<j} \operatorname{cov}(\chi_i, \chi_j).$$

But when $j - i > 1$, by the Gibbs asymptotic independence proved in Theorem 3,
$\operatorname{cov}(\chi_i, \chi_j) \to 0$. Also, there are only $O(n)$ terms such that $j = i + 1$ and in this
case $\operatorname{cov}(\chi_i, \chi_{i+1}) \le 1$, hence the sum $\sum_{i<j} \operatorname{cov}(\chi_i, \chi_j) = o(n^2)$. □


To end the proof of Theorem 5 by Chebyshev's inequality, it only remains to
notice that:


LEMMA 7. For $0 < a < 1$ and large enough $n$,
$$\nu(|C_1| = 1) \ne \mu(|C_1| = 1).$$


PROOF. Recall that $b$ is defined by $\log(1 + b)/b = 1 - a$. For $x \in (0, 1)$, let
$f(x) = 1 - \log(1 + x)/x$, so that $b = f^{-1}(a)$.

On the other hand, an easy consequence of Berestycki and Durrett [3] or Theorem 0
is $\mu(|C_1| = 1) \to e^{-u^{-1}(a)}$. [Indeed, under $\mu$, $|C_1|$ is asymptotically the total
progeny of a Poisson-Galton-Watson process, or PGW process, with offspring
mean $u^{-1}(a)$.]

Hence the lemma is proved if we show that

$$1/(1 + b) \ne e^{-c} \quad\text{or}\quad u(x) \ne 1 - x/(e^x - 1)$$

for all $x > 0$, where $c = u^{-1}(a)$.

We start by noticing that as $x \to 0$, $u(x) \sim x$ but $1 - x/(e^x - 1) \sim x/2$. Hence
$u(x) > 1 - x/(e^x - 1)$ as $x \to 0$. The same is true as $x \to \infty$ [an easy argument
shows indeed that $u(x) = 1 - e^{-x} + o(e^{-x})$]. Now those functions are both concave
as we will see in a moment, hence this has to stay true on the whole open half-line
$x > 0$. (Notice that we have thus proved that the hitting distribution has always
fewer fixed points than the uniform distribution.) □


LEMMA 8. The function u appearing in Theorem 0 is concave. 


PROOF. For $c \le 1$ this is obvious. When $c > 1$, rather than carrying out explicit
calculations on the second derivative of $u$, we use a theoretical argument that
exploits the recent result of Schramm [12], which says that the sizes of the pieces of
the giant component in the random graph have approximately a Poisson–Dirichlet
distribution. Since each fragmentation decreases the distance by 1 and each
coalescence increases it by 1, it is easy to see that

$$\frac{d}{dt} E[d(\sigma_t, I) \mid \mathcal{F}_t] = 1 - 2P[\text{fragm.} \mid \mathcal{F}_t],$$

where $\mathcal{F}_t$ is the canonical filtration generated by the random walk. So we need to
show that $P[\text{fragm.}]$ is an asymptotically increasing function of $c$. However, the
probability of fragmenting a small cycle is asymptotically 0 (by duality and the fact
that $u$ is linear in the subcritical regime), and the probability of fragmenting one of
the giant cycles can be computed explicitly using the Poisson–Dirichlet structure:

$$P[\text{fragm.}] \to E \sum_{i=1}^{\infty} (\theta(c) X_i)^2 = \theta(c)^2\, E \sum_{i=1}^{\infty} X_i^2 = \tfrac{1}{2}\theta(c)^2,$$

where $\theta(c)$ is the survival probability of a PGW($c$) branching process and
$(X_i, i \ge 1)$ follows the $PD(1)$ distribution. ($E \sum_i X_i^2 = 1/2$ follows from [10],
formula (128).) Since $\theta(c)$ is an increasing function of $c$, the lemma is proved (and
thus, so is Theorem 5). □
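The two ingredients of this proof can be checked numerically. The sketch below is an illustration under stated assumptions rather than the argument of [10] or [12]: it computes the PGW($c$) survival probability as the positive root of $\theta = 1 - e^{-c\theta}$, and estimates $E \sum_i X_i^2$ for $PD(1)$ by Monte Carlo, using uniform stick-breaking (whose pieces are the $PD(1)$ masses in size-biased order, so the sum of squares is unchanged). The function names, seed and sample size are choices made for the example.

```python
import math
import random


def survival_probability(c, iters=10_000):
    """Largest root in [0, 1) of theta = 1 - exp(-c * theta); it is 0 when c <= 1."""
    if c <= 1.0:
        return 0.0
    theta = 1.0
    for _ in range(iters):  # fixed-point iteration, decreasing to the positive root
        theta = 1.0 - math.exp(-c * theta)
    return theta


def pd1_sum_of_squares(rng, tol=1e-12):
    """Sum of squared pieces of one uniform stick-breaking sample of [0, 1]."""
    remaining, total = 1.0, 0.0
    while remaining > tol:
        piece = remaining * rng.random()  # break off a uniform fraction of what is left
        total += piece * piece
        remaining -= piece
    return total


rng = random.Random(0)
samples = 20_000
second_moment = sum(pd1_sum_of_squares(rng) for _ in range(samples)) / samples
thetas = [survival_probability(c) for c in (1.5, 2.0, 3.0)]
```

With these inputs `second_moment` should be close to 1/2, and `thetas` should be increasing, in line with the monotonicity of $\theta$ used at the end of the proof.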


REMARK. We have thus proved the formula 


c/2 
u(c) =c/2- | 8(u)* du 
0 
which is perhaps a little simpler to handle than the expression in Theorem 0. 


6. Number of geodesics and Radon-Nikodym derivative. Here we prove 
the following theorem, which we will then use to prove a stronger version of the 
singularity theorem. 


THEOREM 6. Suppose $d(\sigma) = t$ and $m_1, \dots, m_j$ are the cycle lengths of $\sigma$.
The number of paths of length $t$ from $I$ to $\sigma$ is

$$t! \prod_{i=1}^{j} \frac{m_i^{m_i - 2}}{(m_i - 1)!}.$$

From this it follows that if $t = [cn/2]$ with $c < 1$, then

$$r(\sigma) = K_{n,t} \prod_{i=1}^{j} \frac{m_i^{m_i - 2}}{(m_i - 1)!},$$

where $K_{n,t}$ is a constant that only depends on $n$ and $t$.


Sketch of the proof. To see the first result, note that in order to go from $\sigma$ to $I$
in the shortest number of steps we must increase the number of cycles by 1 at each
step, and to do this we must fragment a cycle at each step by transposing two of
its elements. A cycle of length $m_i$ will require $m_i - 1$ fragmentations. The first
step in constructing a path is to decide on how to allocate the $t$ moves between the
original cycles, which can be done in $t!/\prod_{i=1}^{j} (m_i - 1)!$ ways. The next step is to
count the number of ways that we can reduce a cycle of length $m_i$ in $m_i - 1$ steps,
which turns out to be simple: $m_i^{m_i - 2}$.


PROOF OF THEOREM 6. Given a partition of $\{1, 2, \dots, n\}$ into groups
$A_1, \dots, A_j$ of sizes $m_i$, $1 \le i \le j$, the number of forests that consist of trees
with vertex sets $A_1, \dots, A_j$ is, by Cayley's formula for the number of unrooted
trees on $m_i$ vertices,

$$\prod_{i=1}^{j} m_i^{m_i - 2}.$$

Let $t = \sum_i (m_i - 1)$. A given forest can be built up in $t!$ ways so there are

$$t! \prod_{i=1}^{j} m_i^{m_i - 2}$$

paths for our random graph process that end up producing a given partition. The
number of permutations that correspond to a given partition is

$$\prod_{i=1}^{j} (m_i - 1)!.$$

An equal number of paths end at each permutation with cycle sizes $m_i$, $1 \le i \le j$,
so the number of paths to a given permutation is

$$t! \prod_{i=1}^{j} \frac{m_i^{m_i - 2}}{(m_i - 1)!}.$$


If t = [cn/2] with c < 1, then the number of edge choices that end up producing 
no fragmentations is by Theorem 1 in [3] 


(9) 


Taking the ratio of the last two results gives Theorem 6. O 
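The path count in Theorem 6 can be checked by brute force on a small symmetric group. The sketch below is an illustration only: it runs a breadth-first search from the identity in the Cayley graph of $S_5$ generated by all transpositions, counts shortest paths to every permutation, and compares the result with $t! \prod_i m_i^{m_i-2}/(m_i-1)!$. The function names and the choice $n = 5$ are assumptions made for the example.

```python
from itertools import combinations
from math import factorial


def cycle_lengths(perm):
    """Cycle lengths of a permutation stored as a tuple (i maps to perm[i])."""
    seen, lengths = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        length, x = 0, start
        while x not in seen:
            seen.add(x)
            x = perm[x]
            length += 1
        lengths.append(length)
    return lengths


def geodesics_formula(perm):
    """t! * prod_i m_i^(m_i - 2) / (m_i - 1)!, with t = sum_i (m_i - 1)."""
    lengths = cycle_lengths(perm)
    t = sum(m - 1 for m in lengths)
    numerator, denominator = factorial(t), 1
    for m in lengths:
        if m >= 2:  # a fixed point contributes the factor 1
            numerator *= m ** (m - 2)
            denominator *= factorial(m - 1)
    return numerator // denominator


def geodesics_brute_force(n):
    """BFS from the identity; returns (distance, number of shortest paths) dictionaries."""
    identity = tuple(range(n))
    swaps = list(combinations(range(n), 2))
    dist, paths = {identity: 0}, {identity: 1}
    frontier = [identity]
    while frontier:
        next_frontier = []
        for p in frontier:
            for i, j in swaps:
                q = list(p)
                q[i], q[j] = q[j], q[i]  # multiply by the transposition (i j)
                q = tuple(q)
                if q not in dist:
                    dist[q] = dist[p] + 1
                    paths[q] = 0
                    next_frontier.append(q)
                if dist[q] == dist[p] + 1:
                    paths[q] += paths[p]
        frontier = next_frontier
    return dist, paths


dist, paths = geodesics_brute_force(5)
mismatches = [p for p in paths if paths[p] != geodesics_formula(p)]
```

For a single $n$-cycle the formula reduces to $n^{n-2}$, the classical count of minimal factorizations of a cycle into transpositions.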


7. The size of the support of the hitting distribution. In this section we
prove Theorem 7, restated below.


THEOREM 7. Suppose $a < 1/2$. There exists a set $S_n \subset \partial B(an)$ such that
$\nu(S_n) \to 1$ as $n \to \infty$ and


nom ^ |dB(an)| 


Sketch of the proof. Obtaining a decay at least exponential is not very hard,
even in the case $a > 1/2$. However, it is not easy to prove that this is the correct
rate for the decay of $|S|/|\partial B|$, and we restrict ourselves to the case $a < 1/2$.

If $\sigma \in \partial B(an)$, then we can use Theorem 6 to find that

$$\log \nu_0(\sigma) = -an \log n + an + \sum_{k=1}^{n} a_k \log p_k + o(n),$$

where $p_k$ is the Borel distribution with parameter $c$, and $a_k$ is the number of
cycles of $\sigma$ of size $k$. But by the law of large numbers, under $\nu_0$, $a_k/n$ should have
a limit as $n \to \infty$. Hence there is a set $S$ such that $(\log \nu_0(\sigma) + an \log n)/n$
has a limit $-c_1$ whenever $\sigma \in S$. Because $\nu_0(S) \approx 1$, $|S| \approx \exp(an \log n + c_1 n)$.
Moreover it is also true that $\nu(S) \to 1$. On the other hand, precise estimates on
the size of $\partial B(an)$ obtained via Kolchin's representation theorem tell us that
$|\partial B(an)| = \exp(an \log n + c_2 n + o(n))$. (A statement of Kolchin's theorem can
be found below.) Thus, the theorem holds with $\gamma = c_1 - c_2$. To prove that $\gamma \ne 0$,
we argue that the decay has to be at least exponential (a consequence of Kolchin's
representation theorem).


PROOF OF THEOREM 7. We will need precise estimates on the size of $\partial B(an)$.
Because we need estimates to order higher than just $n \log n$, sticking to the large
deviation approach is not good enough. Rather, we will use Kolchin's representation
theorem. We would like to thank Jim Pitman for pointing out this reference to
us.

Suppose we can partition $\{1, \dots, n\}$ into a certain number of clusters, which
can all have different internal states. To be more specific, suppose that each
partition of $\{1, \dots, n\}$ into $k$ clusters leads to $v_k$ possible global states of the system
$\{1, \dots, n\}$, and that we can further assign each cluster of size $j$ one of $w_j$
possible internal states. We call such a combinatorial structure a $(v, w)$-partition (of
$\{1, \dots, n\}$). Kolchin's representation theorem answers by probabilistic means
the following purely combinatorial question: how many different $(v, w)$-partitions
are there? Also, what does a uniform random $(v, w)$-partition look like?

Before going into the details of this theorem, let us see its relevance to our
problem. The number of permutations at distance $an$ from the identity is a special
instance of the above Kolchin problem, where $v_k = \mathbf{1}_{\{k = (1-a)n\}}$ and $w_j = (j - 1)!$.
Indeed a permutation at distance $an$ is exactly a permutation having $(1 - a)n$ cycles
and each cluster of size $j$ can be in one of the $(j - 1)!$ possible orderings of the
cycle.
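The counting statement above says that the sphere of radius $d$ in $S_n$ consists of the permutations with $n - d$ cycles, so its size is an unsigned Stirling number of the first kind. The sketch below illustrates this for a small $n$ by comparing the standard Stirling recurrence with a direct enumeration; it is not Kolchin's representation itself, and the function names and the choice $n = 6$ are assumptions made for the example.

```python
from itertools import permutations


def unsigned_stirling_first(n):
    """Row n of the unsigned Stirling numbers of the first kind:
    entry k is the number of permutations of n elements with exactly k cycles."""
    row = [1]  # n = 0: the empty permutation has 0 cycles
    for m in range(1, n + 1):
        new_row = [0] * (m + 1)
        for k in range(1, m + 1):
            # element m either forms its own cycle or is inserted after one of
            # the m - 1 earlier elements in an existing cycle
            new_row[k] = row[k - 1] + ((m - 1) * row[k] if k < m else 0)
        row = new_row
    return row


def count_cycles(perm):
    """Number of cycles of a permutation stored as a tuple (i maps to perm[i])."""
    seen, cycles = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            cycles += 1
            x = start
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return cycles


n = 6
stirling_row = unsigned_stirling_first(n)
# Sphere of radius d around the identity: permutations with n - d cycles.
sphere_sizes = [stirling_row[n - d] for d in range(n)]
brute_force = [0] * n
for perm in permutations(range(n)):
    brute_force[n - count_cycles(perm)] += 1
```

Here `sphere_sizes[1]` is the number of transpositions, $\binom{6}{2} = 15$, and the sizes sum to $6! = 720$.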

Here is the content of Kolchin's theorem (see [10]). Let $v(\theta) = \sum_{k \ge 1} v_k \theta^k / k!$
and let $w(\xi) = \sum_{j \ge 1} w_j \xi^j / j!$ be the so-called exponential generating functions of
the sequences $v$ and $w$. Let $K$ be an integer-valued random variable with distribution

$$P(K = k) = \frac{v_k\, w(\xi)^k}{k!\, v(w(\xi))}$$

and let $X$ be a random variable distributed according to

$$P(X = j) = \frac{w_j\, \xi^j}{j!\, w(\xi)}.$$

Here $\xi$ is any parameter. In our setting, $K = (1 - a)n$ a.s., and $X$ has the so-called
logarithmic distribution, $P(X = j) = \text{const.} \cdot b^j / j$, for some parameter $b$
depending on $\xi$.




THEOREM 9 (Kolchin). The number of $(v, w)$-partitions is given by

$$\frac{n!\, v(w(\xi))}{\xi^n}\, P\Big(\sum_{i=1}^{K} X_i = n\Big),$$

where $X_i$ are i.i.d. samples of the variable $X$. Moreover, the sizes of the clusters in
exchangeable random order have the same law as

$$(X_1, \dots, X_K) \text{ given } X_1 + \cdots + X_K = n.$$


For a precise definition of exchangeable random order, and further discussion
of this theorem, see [10]. It is to be noted that here $\xi$ is any parameter. By playing
on this parameter so as to make the event $S_K = n$ not unlikely (e.g., of probability
of order $n^{-1/2}$ rather than exponentially small), we get that the sizes of the clusters
are approximately drawn from the r.v. $X$. Note that as a consequence we get here
another proof of Theorem 3. Indeed, we see that the sizes of the cycles of a uniform
permutation on $\partial B$ in exchangeable random order have a logarithmic distribution
(asymptotically when the parameter is chosen appropriately). Hence a size-biased pick
$|C_1|$ should have distribution $P(X' = j) = \text{const.} \cdot j \cdot b^j / j \propto b^j$, a geometric
random variable. The similarity between the large deviations-statistical mechanics
approach and Kolchin's theorem is striking.

Another straightforward consequence of this theorem is the precise asymptotics
for the size of a sphere of radius $an$. Indeed, in our setting, $v(\theta) = \theta^{(1-a)n} / [(1 - a)n]!$
and $w(\xi) = -\log(1 - \xi)$, hence:

$$|\partial B(an)| = \frac{n!\, (-\log(1 - \xi))^{(1-a)n}}{\xi^n\, [(1 - a)n]!}\, P\Big(\sum_{i=1}^{(1-a)n} X_i = n\Big),$$

where $\xi$ is still any parameter. However, when $\xi$ is chosen such that $(1 - a) E(X) = 1$,
the local central limit theorem shows that $P(\sum_{i=1}^{(1-a)n} X_i = n) \sim C n^{-1/2}$.
By Stirling's formula, it is now straightforward to see that

$$|\partial B(an)| = \exp(an \log n + c_2 n + o(n)).$$


Let us now turn our attention to the hitting distribution. We will get the
corresponding estimate by analyzing the Radon–Nikodym derivative $r(\sigma)$ and the law
of large numbers for $\nu$, as mentioned in the sketch of the proof.

More precisely, it follows from the proof of Theorem 6 that if $\sigma \in \partial B(an)$, with
cycle decomposition of size $m_1, \dots, m_{(1-a)n}$, and $t = an$, then

$$\nu_0(\sigma) = \frac{t!}{\binom{n}{2}^t e^{-\kappa(c)}} \prod_{i=1}^{n(1-a)} \frac{m_i^{m_i - 2}}{(m_i - 1)!}.$$




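Let us write $a_k$ for the number of cycles of $\sigma$ of size $k$, so that $\sum_k a_k = n(1 - a)$
and $\sum_k k a_k = n$. We can rewrite the above as

$$\nu_0(\sigma) = \frac{t!}{\binom{n}{2}^t e^{-\kappa(c)}} \prod_{k=1}^{n} \Big(\frac{k^{k-2}}{(k - 1)!}\Big)^{a_k} = \frac{t!}{\binom{n}{2}^t e^{-\kappa(c)}} \prod_{k=1}^{n} \Big(\frac{k^{k-1}}{c\, k!} (c e^{-c})^k\Big)^{a_k} \frac{c^{n(1-a)}}{(c e^{-c})^n}.$$

When we take the logarithm, calling $q_k = \frac{k^{k-2}}{c\, k!} (c e^{-c})^k$ and $p_k = k q_k$,

$$\log \nu_0(\sigma) = \kappa(c) - t \log \binom{n}{2} + \log t! + \sum_{k=1}^{n} a_k \log p_k + n(1 - a) \log c - n \log c + cn.$$

Recalling that $t = an$, $c = 2a$, and using Stirling's formula, we find that

(7.1)    $$\log \nu_0(\sigma) = -an \log n + an + \sum_{k=1}^{n} a_k \log p_k + o(n).$$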
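The quantities $p_k$ and $q_k$ in (7.1) satisfy two identities that are easy to check numerically in the subcritical range: the Borel probabilities $p_k = e^{-ck}(ck)^{k-1}/k!$ sum to 1, and $q_k = p_k/k$ sums to $1 - c/2 = 1 - a$, the limiting number of cycles per element. The sketch below is only an illustration; the value $c = 0.6$ (that is, $a = 0.3$), the truncation level and the function name are assumptions made for the example.

```python
import math


def log_borel_pmf(k, c):
    """log of p_k = exp(-c k) (c k)^(k-1) / k!, the Borel distribution with parameter c."""
    return -c * k + (k - 1) * math.log(c * k) - math.lgamma(k + 1)


c = 0.6            # c = 2a with a = 0.3, inside the range a < 1/2
truncation = 2000  # the terms decay geometrically, so the neglected tail is negligible
p = [math.exp(log_borel_pmf(k, c)) for k in range(1, truncation + 1)]
q = [p_k / k for k, p_k in enumerate(p, start=1)]

total_p = math.fsum(p)  # should be close to 1
total_q = math.fsum(q)  # should be close to 1 - c/2
```

Working with logarithms avoids overflow in $k^{k-1}$ and $k!$ for large $k$.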


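We would now like to use the law of large numbers for $\nu_0$ since we know that
under $\nu$,

$$\nu\Big(\Big\{\sigma \in \partial B(an) : \Big|\frac{a_k}{n} - q_k\Big| < \varepsilon;\ \forall\, 1 \le k \le n\Big\}\Big) \to 1$$

for given $\varepsilon > 0$, and where $p_k$ is the Borel distribution. But it is not directly
possible to take $S = \{\sigma \in \partial B : |a_k/n - q_k| < \varepsilon;\ \forall\, 1 \le k \le n\}$ since we would obtain a
bound $\varepsilon \sum_{k=1}^{n} \log p_k = \infty$. So we need to modify our choice: let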


$$S_n = \Big\{\sigma \in \partial B(an) : \Big|\frac{a_k}{n} - q_k\Big| \le (\log n)^{-5};\ \forall\, 1 \le k \le (\log n)^2 \text{ and } a_k = 0 \text{ otherwise}\Big\}.$$

There are two things we need to check on $S_n$. First we need to see that it has the
property that $\nu(S_n) \to 1$, and also that it has the correct size asymptotically. The
first thing is taken care of by the next lemma, while the second will follow from
the fact that $\nu_0(S_n) \to 1$, itself also a consequence of the lemma below.


LEMMA 9. $\nu(S_n) \to 1$.


PROOF. It suffices to prove that $\nu(\partial B - S_n) \to 0$. By the coupling with an
Erdős–Rényi random graph, there is a $\beta > 0$ such that no cycle can be greater than
$\beta \log n$ with high probability under $\nu$. Hence it remains to prove that

$$\sum_{k=1}^{(\log n)^2} \nu\Big(\Big|\frac{a_k}{n} - q_k\Big| > (\log n)^{-5}\Big) \to 0.$$

The basic idea is to use random graph estimates. Let $G(t)$ be the result of a random
graph process where edges are added in a Poissonian way at rate $\binom{n}{2}^{-1}$. Then the
expectation of the number of clusters of size $k$, $a_k$, in $G(an)$, is known to be $n q_k$
asymptotically with standard deviation $O(n^{1/2})$, which is much smaller than the
$n(\log n)^{-5}$ from the definition of $S_n$. So we need to show this still holds under $\nu$.
Recall that we can couple the process $(G(t), t \ge 0)$ with a random walk $(\sigma_t, t \ge 0)$
where we multiply by a transposition $(i, j)$ whenever edge $(i, j)$ arrives in $G(t)$.
Thus we may consider the first time $T$ that $\sigma$ is at distance $\lfloor an \rfloor$ of the identity,
and obviously a realization of $\nu$ is obtained as $\sigma_T$. We consider also $T'$ the first
time that $\lfloor an \rfloor$ edges have been added to $G(t)$. Then since $T'$ is the first time a
Poisson process with intensity 1 exceeds the value $\lfloor an \rfloor$, $T'$ has mean $\lfloor an \rfloor$ and
variance $O(n)$. Hence the number of edges added to $G$ between $\lfloor an \rfloor \wedge T'$ and
$\lfloor an \rfloor \vee T'$ is $O(n^{1/2})$. On the other hand, only a bounded number $Z_{an}$ of edges
are added between $T'$ and $T$, corresponding to the number of fragmentations at
time $an$, and thus all in all only $O(n^{1/2})$ edges are added between $an \wedge T$ and
$an \vee T$, so this may create or destroy at most $O(n^{1/2})$ clusters of size $k$ for each
$1 \le k \le (\log n)^2$, which is much smaller than the $n(\log n)^{-5}$ from the definition of
$S_n$. Hence we conclude that

$$\sum_{k=1}^{(\log n)^2} \nu\Big(\Big|\frac{a_k}{n} - q_k\Big| > (\log n)^{-5}\Big) \to 0.$$

Thus $\nu(\partial B - S_n) \to 0$. Since $\nu_0$ is obtained as an asymptotically nondegenerate
conditioning of $\nu$, it follows immediately that $\nu_0(\partial B - S_n) \to 0$ as well, that is,
$\nu_0(S_n) \to 1$. We now show how to use this to estimate the size of $S_n$. By (7.1), for
all $\sigma \in S_n$,


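$$\frac{\log \nu_0(\sigma) + an \log n}{n} \ge a + \sum_{k=1}^{(\log n)^2} \big(q_k + (\log n)^{-5}\big) \log p_k + o(1) \ge a + \sum_{k=1}^{(\log n)^2} q_k \log p_k + o(1),$$

from which we deduce that

$$\liminf_{n \to \infty} \frac{1}{n} (\log \nu_0(\sigma) + an \log n) \ge a + \sum_{k=1}^{\infty} q_k \log p_k =: -c_1.$$

After similar treatment for the lim sup, we get

$$\frac{1}{n} (\log \nu_0(\sigma) + an \log n) \to -c_1.$$

Since

$$\nu_0(S_n) = \sum_{\sigma \in S_n} \nu_0(\sigma) \to 1,$$

it must be that $|S_n| = \exp(an \log n + c_1 n + o(n))$. Therefore

$$\lim_{n \to \infty} \frac{1}{n} \log \frac{|S_n|}{|\partial B|} = c_1 - c_2 =: \gamma.$$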


It now remains to show that $\gamma \ne 0$. Observe that by Kolchin's theorem, we could
pursue the asymptotic expansion of $|\partial B(an)|$ and the next term would be polynomial
in $n$. From the exact formula of $\nu(\sigma)$, we could also find the next term for $|S|$
and find that it is polynomial. Hence if $\gamma = 0$, then we would have $|S|/|\partial B| \sim n^{-\alpha}$
for some $\alpha > 0$. But another consequence of Kolchin's theorem is that the decay
has to be at least exponential: for instance, permutations in $S$ have a number of
fixed points characteristic of $\nu$ and not of $\mu$. As we have seen earlier the number
of fixed points under $\mu$, $n/(1 + b)$, is smaller than under the hitting distribution.
But since the number of fixed points under $\mu$ is given by a sum of almost independent
random variables,


n--|an | n—|an | 


2 liy;-1) given > X, —n, 
isi 


i-1 
we have that 


1 n—|an| 1 n—|an| 
a(S) < PG » dece D 9. Xi =n) 
is] T ixl 


n 
/ 
« Cn? e^"? « C'e "P 


by standard large deviations (here, simply Markov's inequality), and because the
event on which we condition is of probability $C n^{-1/2}$. Hence the decay has to be
at least exponential and $\gamma$ cannot be 0. □


REMARK. The same argument shows that the hitting distribution of $\sigma$ is supported
on a set at least exponentially small even in the case $a > 1/2$, but of course
we do not know whether this is the precise asymptotics. If the decay is still
exponential after $a = 1/2$, it seems likely that the exponential coefficient will not be
smooth at $a = 1/2$. In Figure 2 we have plotted the value of this coefficient against
a time-change of $a$. It would be interesting to compute exact asymptotics in the
case $a > 1/2$ and make this picture complete.


REMARK.  Kolchin's representation theorem could have been used already 
earlier for the proofs of Theorems 2 and 3. This would actually simplify the proof 
of both results. However, we have chosen to keep the proofs as they were, because 
they do not rely on a technical result such as Kolchin's theorem, which is not as 
well known as standard large deviation theory. 




FIG. 2. Numerical evaluation by Maple of the limiting behavior of $-n^{-1} \log(|S_n|/|\partial B|)$. The
result is plotted as a function of $\xi = f^{-1}(a)$, where $f(\xi) = 1 + \log(1 - \xi)(1 - \xi)/\xi$. $f^{-1}$ is
an increasing function of $a$. This picture has been rigorously proved only for $a < 1/2$, that is,
$\xi < f^{-1}(1/2) \approx 0.715331863\ldots$.
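The numerical constant in the caption can be reproduced directly from the stated formula for $f$. The sketch below is an illustration only: it inverts $f(\xi) = 1 + \log(1 - \xi)(1 - \xi)/\xi$ on $(0, 1)$ by bisection, which is valid because $f$ increases from 0 to 1 on that interval. The function names and iteration count are choices made for the example.

```python
import math


def f(xi):
    """f(xi) = 1 + log(1 - xi) (1 - xi) / xi for 0 < xi < 1."""
    return 1.0 + math.log1p(-xi) * (1.0 - xi) / xi


def f_inverse(a, iters=200):
    """Solve f(xi) = a for xi in (0, 1) by bisection (f is increasing on (0, 1))."""
    lo, hi = 1e-12, 1.0 - 1e-12
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


xi_half = f_inverse(0.5)  # the threshold quoted in the caption of Figure 2
```

The value `xi_half` agrees with the quoted $0.715331863\ldots$ to the displayed digits.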


8. Asymptotic hyperbolicity under the uniform measure. Here we present 
a proof of Theorem 8. The sketch of the proof below contains some ideas that will 
be used and not re-explained in the actual proof that follows. 


THEOREM 8. Let $0 < a < 1$ and let $\sigma, \pi$ be two random independent points
chosen uniformly from $\partial B(an)$. Then:

1. If $a < 1 - \log 2$,
$$E(\sigma | \pi)_p \le \delta (\log n)^2$$
for some $0 < \delta = \delta(a) < \infty$. Moreover, with probability asymptotically 1, there
is a geodesic between $\sigma$ and $\pi$ that comes within distance at most $\delta (\log n)^2$
of $p$.

2. If $a > 1 - \log 2$,
$$E(\sigma | \pi)_p \sim \delta n$$
for some $\delta = \delta(a) > 0$. Moreover, no geodesic can approach $p$ closer than $\delta' n$
for some $0 < \delta' < \infty$.


Sketch of the proof. To guess what the answer is, we exploit once again the
connection with the theory of random graphs. The first thing to do is to realize
that because of the symmetries of the Cayley graph $G_n$, it is enough to look at
$d(I, \sigma \cdot \pi)$ and see whether it is approximately $2an$ or much smaller than $2an$.

To construct our graph, we will need some notation. Let

(8.1)    $$\pi = \tau_1 \tau_2 \cdots \tau_{an}$$

be a minimal decomposition of $\pi$ as a product of transpositions, with the following
convention. If we list all cycles of $\pi$ in the order of their least element, then
the transpositions $\tau_i$ are those $(x, y)$ such that $y$ comes just after $x$ in the cyclic
decomposition of $\pi$ and $x$ and $y$ are in the same cycle, and we order the $an$
transpositions according to their position in this canonical decomposition. To clarify
the ideas, suppose

$$\pi = (1\ 4\ 3\ 7)(2)(5\ 8)(6\ 10\ 9);$$

then we write

$$\pi = (1\ 4)(4\ 3)(3\ 7)(5\ 8)(6\ 10)(10\ 9).$$
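The canonical decomposition can be produced mechanically. The sketch below is an illustration for the example permutation only: it lists the cycles in order of their least element, reads off the transpositions of consecutive elements, and checks that their product is the original permutation. The check assumes that products are composed right to left (the rightmost transposition acts first); this convention, like the function names, is an assumption made for the example.

```python
def cycles_by_least_element(perm):
    """Cycles of perm (a dict x -> perm(x)), ordered by least element,
    each cycle written starting from its least element."""
    seen, cycles = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, x = [], start
        while x not in seen:
            seen.add(x)
            cycle.append(x)
            x = perm[x]
        cycles.append(cycle)
    return cycles


def canonical_transpositions(cycles):
    """Transpositions (x, y) with y immediately after x inside a cycle, in order."""
    return [(c[i], c[i + 1]) for c in cycles for i in range(len(c) - 1)]


def product_right_to_left(transpositions, points):
    """Compose the transpositions as maps, rightmost factor applied first."""
    result = {}
    for x in points:
        y = x
        for u, v in reversed(transpositions):
            if y == u:
                y = v
            elif y == v:
                y = u
        result[x] = y
    return result


# pi = (1 4 3 7)(2)(5 8)(6 10 9) on {1, ..., 10}
pi = {1: 4, 4: 3, 3: 7, 7: 1, 2: 2, 5: 8, 8: 5, 6: 10, 10: 9, 9: 6}
cycles = cycles_by_least_element(pi)
taus = canonical_transpositions(cycles)
reconstructed = product_right_to_left(taus, pi.keys())
```

The list `taus` has $n$ minus the number of cycles entries, here $10 - 4 = 6$, matching the length of a minimal decomposition.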


We define the graph $\Gamma = (V, E)$ on $n(1 - a)$ vertices as follows. Let
$V = \{\text{cycles of } \sigma\}$, and there is an edge between $C$ and $C'$ if there is $x \in C$ and
$y \in C'$ such that $(x, y)$ is one of the $an$ transpositions in the minimal decomposition
described above. Note that this graph could have self-loops and multi-edges.

A notion that we will use on several occasions is that of being a terminal point.
We say $x \in \{1, \dots, n\}$ is terminal if $x$ does not appear more than once in the
transpositions of the above minimal decomposition. This means that, with those
conventions, $x$ is situated at the "end" of the cycle of $\pi$ in which it is contained.

Here is why we are interested in the properties of the graph $\Gamma$. If we define
$\sigma_0 = \sigma$ and, for $1 \le r \le an$, $\sigma_r = \sigma \cdot \tau_1 \cdots \tau_r$, and consider the process
$(\sigma_r, 0 \le r \le an)$, this is a walk on $G_n$ starting at $\sigma$ and ending at $\sigma \cdot \pi$. Moreover,
since at each step we are multiplying by a transposition, the cycles of $\sigma_r$ evolve
according to a discrete coagulation–fragmentation chain, with cycles merging when
the transposition involves elements from different cycles, and cycles splitting
otherwise (as is the case for simple random walk on $G_n$). Therefore, $\Gamma$ is the graph
that results from drawing an edge between two cycles of $\sigma$ as we encounter a
transposition joining those two cycles. In particular, the same argument that shows that
the Erdős–Rényi random graph is an upper bound for the sizes of the cycles of
simple random walk on $G_n$ will show that the cycles of $\sigma \cdot \pi$ are subcomponents
of the connected components of $\Gamma$, with possibility of fragmentation whenever
there is a cycle in $\Gamma$, or a self-loop or a multiple edge. All other edges represent
coalescence of cycles in the walk $(\sigma_r, 0 \le r \le an)$.

In particular, the property of $\Gamma$ that we will be most interested in will be to
decide whether $\Gamma$ has a giant component, meaning a component containing a positive
fraction of all $n(1 - a)$ vertices. Indeed when all cycles of $\Gamma$ are small, we should
expect very few cycles in $\Gamma$ and hence little fragmentation. Hence most steps of
the walk $\sigma_r$ are coalescence events and the number of cycles decreases linearly;
in other words, in the case that all cycles are small, $d(\sigma, \pi) \approx 2an$. On the other
hand, if $\Gamma$ contains a giant cycle, then we can expect many cycles in the graph and
hence many fragmentation events in the walk $(\sigma_r, 0 \le r \le an)$, which means that
$d(\sigma, \pi) < 2an$.

Here is our strategy to see whether there is a giant component in $\Gamma$. Rather
than counting the number of cycles of $\sigma$ that a component of $\Gamma$ contains, we
prefer to compute the exact number of integers in $\{1, \dots, n\}$ that it actually encloses
prior to shrinking all cycles of $\sigma$ into points. Formally, this means: give weight
$W(C) = |C|$ to any vertex $C$ of $\Gamma$, and ask what is the total weight of a connected
component of $\Gamma$. Let $\mathcal{C}_1(\Gamma)$ denote a size-biased pick from the connected
components of $\Gamma$, that is, the total weight of the connected component of $\Gamma$ containing "1"
[or, more precisely, $C_1(\sigma)$].

Lemma 10 shows that $W(\mathcal{C}_1(\Gamma))$ converges in distribution to the total progeny
of a branching process with offspring distribution a shifted geometric random
variable. The idea is that by Theorem 3, the various cycles of $\sigma$ are asymptotically
i.i.d., so that each edge in $\Gamma$ adds to the weight of $\mathcal{C}_1(\Gamma)$ a contribution which
is, by Theorem 3, asymptotically a geometric random variable $G$ with parameter
$1/(1 + b)$ where $b$ satisfies $\log(1 + b)/b = 1 - a$. This seems to give an infinite
progeny almost surely (since $G \ge 1$ a.s.). However, for every point that we examine
there is a positive probability that it is a terminal point. In this case, that integer
does not connect to a new independent cycle of $\sigma$, and hence its offspring is 0. This
kind of modified branching process is defined more precisely and analyzed in
Section 8.3. The key fact is that because of the special properties of the asymptotic law
of $\pi$, which involves geometric random variables, this modified branching process
is in fact equal in distribution to another branching process where the offspring
distribution has been shifted from $G$ to $G - 1$. In all that follows, we call $X$ a
random variable such that

$$X \overset{d}{=} G - 1.$$




Hence $\Gamma$ has a giant component if, and only if, $E(G) > 2$. Since $p = 1/(1 + b)$
and $\log(1 + b)/b = 1 - a$,

$$P(\Gamma \text{ has a giant component}) > 0 \iff a > 1 - \log 2.$$
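The threshold $1 - \log 2$ follows from $E(G) = 1/p = 1 + b$: the shifted offspring mean is $E(X) = b$, which exceeds 1 exactly when $\log(1+b)/b < \log 2$. The sketch below checks this numerically and also verifies the thinning identity $(1 - p)P(G = j) = P(G - 1 = j)$ behind the shift from $G$ to $G - 1$. It is an illustration only; the helper names and the sample values of $a$ are assumptions made for the example.

```python
import math


def solve_b(a, iters=200):
    """Solve log(1 + b) / b = 1 - a for b > 0 by bisection on a log scale (0 < a < 1)."""
    target = 1.0 - a
    lo, hi = 1e-12, 1e12
    for _ in range(iters):
        mid = math.sqrt(lo * hi)
        if math.log1p(mid) / mid > target:  # log(1+b)/b is decreasing in b
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


def geometric_pmf(j, p):
    """P(G = j) = (1 - p)^(j - 1) p for j >= 1."""
    return (1.0 - p) ** (j - 1) * p


a_crit = 1.0 - math.log(2.0)
b_crit = solve_b(a_crit)  # offspring mean E(X) = E(G) - 1 = b equals 1 here

# Offspring means below and above the threshold a_crit (about 0.3069).
mean_sub, mean_super = solve_b(0.2), solve_b(0.4)

# Thinning identity: keep G children with probability 1 - p, otherwise none.
p = 1.0 / (1.0 + solve_b(0.4))
thinned = [p] + [(1.0 - p) * geometric_pmf(j, p) for j in range(1, 50)]
shifted = [(1.0 - p) ** j * p for j in range(50)]  # law of G - 1
```

The lists `thinned` and `shifted` agree term by term, which is the case $p' = p$ of the modified branching process described in Section 8.3.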
PROOF OF THEOREM 8. 


8.1. Structure of the proof. As this proof is rather long, we feel that it is
appropriate to explain how the various arguments are used. In Section 8.2, we prove
that $W(\mathcal{C}_1(\Gamma)) \Rightarrow \sum_{t \ge 0} Z_t$: the total progeny of a branching process with offspring
distribution $X$. Then in Section 8.3, we define a modified branching process, and
prove that in the case of geometric random variables this becomes another branching
process with shifted offspring distribution. We then use this to prove by hand
that in the subcritical case, $|C_1(\sigma \cdot \pi)|$ is dominated by such a modified branching
process. Since this is a subcritical branching process, we prove the exponential
decay of the tail of $|C_1(\sigma \cdot \pi)|$, uniformly in $n$. This enables us to show that
as long as $a < 1 - \log 2$ there are very few fragmentation events in the walk
$(\sigma_r, r = 0, \dots, an)$. The supercritical case is treated in Section 8.5. Since we have
established branching process asymptotics, we can use the duality principle of a
branching process between the subcritical phase and the supercritical phase. This
shows that the number of clusters of $\Gamma$ in the supercritical regime can be computed
by looking at the number of clusters of $\Gamma$ for some specific subcritical time. Since
we have proved that the distance is linear in this regime, we now know how many
clusters $\Gamma$ has at any subcritical time, and it follows that the number of clusters
of $\Gamma$ in the supercritical regime is strictly less than what it would be if the distance
was still linear. It only remains to prove that at any given time the number of extra
cycles that were generated by some fragmentation (and have not been reabsorbed
by other large cycles) is $O(n^{1/2})$, which is done in Section 8.6.


8.2. Branching process asymptotics. To start proving things, we need some
more notation. Let $A_0 = \{1\}$ and define recursively the $A_k$ by

$$A_{k+1} = \bigcup_{x \in A_k} \{C_{\pi(x)}(\sigma)\} - \bigcup_{1 \le j \le k} A_j.$$

The $A_k$ correspond to growing the branching process generation after generation,
rather than cycle after cycle. Let $(Z_t, t = 0, 1, \dots)$ be a branching process with
offspring distributed as $X$. Note that by the construction of $\Gamma$, we also have that

$$\sum_{k=0}^{\infty} |A_k| = W(\mathcal{C}_1(\Gamma)).$$


LEMMA 10. As $n \to \infty$,
$$(|A_0|, |A_1|, \dots) \Rightarrow (Z_0, Z_1, \dots).$$



PROOF. Let us start with the convergence of $(|A_0|, |A_1|)$. If $j = 0$, $P(|A_0| = 1,
|A_1| = 0) = P(\pi(1) = 1) \to 1/(1 + b) =: p = P(X = 0)$. If $j \ge 1$, then

$$P(|A_0| = 1, |A_1| = j) = P(\pi(1) \ne 1,\ |C_{\pi(1)}(\sigma)| = j)$$
$$= P(\pi(1) \ne 1) \cdot P(|C_{\pi(1)}(\sigma)| = j \mid \pi(1) \ne 1)$$
$$\to (1 - p) \cdot (1 - p)^{j-1} p = P(X = j).$$

Indeed, conditionally on $\{\pi(1) \ne 1\}$, $\pi(1)$ is uniform on $\{2, \dots, n\}$, so that
$C_{\pi(1)}(\sigma)$ is as good a size-biased pick as $C_1(\sigma)$, and we can apply Theorem 3.
Now let us consider the general case of finite-dimensional distributions. Let
$n_1 > 0, n_2 > 0, \dots, n_k > 0$ with $\sum_j n_j \le n$. We are trying to compute the
asymptotics of

$$P(|A_0| = 1, |A_1| = n_1, \dots, |A_k| = n_k).$$


To do this, we need to evaluate the probability of a collision occurring in the first
$k$ stages, that is,

$$P\Big(\pi(x) \in \bigcup_{1 \le i \le k} A_i - C_x(\pi) \text{ for some } x \in A_j \text{ with } j \le k\Big).$$

We will say of an $x$ such as in the event above, that it makes a backward
connection. Hence an $x$ makes a backward connection if $\pi(x)$ maps it to some lower
level in the branching process, but $x$ is not a terminal point. Therefore backward
connections (or collisions) are exactly those that may lead to a fragmentation, as
explained in the sketch of the proof.

It is easy to see that P (collision in first k stages) = O(1/n). In fact, it follows 
from the uniformity of Lemma 11 that 


: DON 


P(|Ag] = no, ..., |Ag| = ny; b.w. collisions in k first stages) < 3 nj 
j=l ý 
c 2 nj J“ 
7 n 
(see also Lemma 12 where similar estimates are derived). 
Therefore it is enough to consider

$$P(|A_0| = 1, |A_1| = n_1, \dots, |A_k| = n_k \mid \text{no b.w. connection}).$$

Suppose $A_{k-1} = \{x_1, \dots, x_{n_{k-1}}\}$. Conditionally on the event that there is no
collision in the $k$ first stages, $\pi(x_1), \dots, \pi(x_{n_{k-1}})$ belong to yet unexplored cycles
(as long as they are not terminal). After decomposition on the number of such $x$
[call $T(A_{k-1})$ the number of terminal elements in the set $A_{k-1}$], the last probability
is equal to


nk—1—J 


= 2. P( »» IG (0) em 2 P(T(Az , — j| no b.w. conn.)) 
j=l 


ist 


Rk—1—j 
»x Y xn) 


j20 i=] 
Rk] 
= o> Xi =n] 
i=] 


by the asymptotic independence property of a finite number of size-biased cycles,
and the fact that given there was no backward connection, the $x$ in level $A_{k-1}$
belong to different cycles of $\pi$, so that the events that they are terminal are
independent asymptotically.

These are the transition probabilities of a branching process with offspring
distribution $X$, so the lemma is proved. □


8.3. A modified branching process. One way to formalize the idea that a ver- 
tex has a geometric number of children only during finitely many generations, is 
to use a modified branching process where each individual x is endowed with a 
nonnegative, integer-valued random variable T (x), that represents the “life-time” 
of its family. As long as T(x) > 0, x will keep having children according to the 
original offspring progeny L. But when T (x) = 0, the individual will be declared 
“terminal” and will not be allowed to have any children. 

Here is a rigorous description of this modified branching process. Let $X_{t,i}$ be a
collection of i.i.d. random variables with distribution $L$, a fixed distribution on the
nonnegative integers (the original progeny). Let $T_{t,i}$ be i.i.d. nonnegative
integer-valued random variables, distributed according to another distribution $L'$, the
lifetime. Let $Z_t$ be the size of the process at time $t$ (with discrete time). Define $Z_0 = 1$,
and give the root lifetime $T_{0,0}$. Then define recursively $Z_t$ by

(8.2)    $$Z_{t+1} = \sum_{i=0}^{Z_t} X_{t,i}\, \mathbf{1}_{\{T(x_i) > 0\}},$$

where $x_1, \dots, x_{Z_t}$ are the $Z_t$ individuals of generation $t$. If $y_1, \dots, y_{Z_{t+1}}$ are
the $Z_{t+1}$ individuals of generation $t + 1$, the rule that we adopt for the value
of $T(y_1), \dots, T(y_{Z_{t+1}})$ is the following. If $T(x_i) > 0$, give all $X_{t,i}$ children
of $x_i$ independent lifetimes from $T_{t,i}$, except for one of its children, say $y_j$, for
which $T(y_j) := T(x_i) - 1$. Rigorously, let $N_t = \#\{i : T(x_i) > 0\}$, rewrite the $x_i$'s
removing the terminal ones and call them $x'_1, \dots, x'_{N_t}$. Let $T(y_1) = T(x'_1) - 1, \dots,
T(y_{N_t}) = T(x'_{N_t}) - 1$, and let $T(y_{N_t+1}) = T_{t+1,N_t+1}, \dots, T(y_{Z_{t+1}}) = T_{t+1,Z_{t+1}}$.

Of course we make this definition because asymptotically, $W(\mathcal{C}_1(\Gamma))$ will be
well approximated by such a system, where the offspring $L$ is the size of a cycle
of $\sigma$, and where $L'$ is the size of a cycle of $\pi$. Indeed, suppose we are exploring the
cluster containing 1 in the graph of the superposition of $\sigma$ and $\pi$. $T(1)$ is $|C_1(\pi)|$,
which corresponds to the fact that after that many iterations of $\pi$ we are back to
where we started and no longer add anything new to the population of the cluster.
However, after one iteration say, all vertices in the first generation, other than $\pi(1)$
itself, belong to different cycles of $\pi$ with high probability. Therefore their lifetime
should be an independent random variable, distributed as $L'$.

In general, the ageing branching process $(Z_t, t = 0, 1, \dots)$, where each individual has a "lifetime" that it transmits to one of its children, is not a Markov process with respect to its own filtration $\sigma(Z_0, Z_1, \dots)$. Indeed the size of generation $t+1$ depends not only on the size of generation $t$, but also on the random variables $T(x)$ where $x$ is an individual of generation $t$, so one would need to add to the filtration the values of $T(x)$ for each generation.

However, a miracle happens due to the fact that the cycles of $\pi$ have (asymptotically) a geometric distribution $G$. Let $p'$ be the parameter of $G$: $P(G = j) = (1 - p')^{j-1} p'$. Then the distribution $L'$ of the random variables $T_{t,i}$ is again $G$. For $k \ge 1$, conditionally on $T > k$, $T - k$ is distributed as $G$. This fact, called "lack of memory," has the following amazing consequence:


PROPOSITION 1. When the lifetime $L'$ is a geometric random variable, $(Z_t, t = 0, \dots)$ is a Markovian branching process with offspring distribution $L\,\mathbf{1}_{\{G > 1\}}$. When $L \overset{d}{=} L' \overset{d}{=} G$, this distribution is $X \overset{d}{=} G - 1$.


PROOF. Let $B_{t,i}$ be Bernoulli random variables with success parameter $P(B_{t,i} = 1) = p'$. Because $G \overset{d}{=} \inf\{t > 0 : B_{t,j} = 1\}$, the event $\{T(x_i) > 0\}$ is the same as $\{B_{t,i} = 0\}$, so (8.2) becomes

$$Z_{t+1} = \sum_{i=1}^{Z_t} X_{t,i}\,\mathbf{1}_{\{B_{t,i} = 0\}}. \tag{8.3}$$

This expresses the fact that for each new vertex visited, we can take the decision of 
closing the cycle, independently of the past. When the cycle still has some length 
to be explored, then the vertex has X; ; children. This decision affects the law of 
progeny at a given vertex. The new distribution of the progeny is now, by (8.3): 
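$$P(X = j) = \begin{cases} p' + (1 - p')\,P(X_{t,i} = 0), & \text{if } j = 0,\\ (1 - p')\,P(X_{t,i} = j), & \text{if } j \ge 1. \end{cases} \tag{8.4}$$

Of course, for our problem, $\sigma \overset{d}{=} \pi$, so both $L$ and $L'$ are distributed as $G$. As can be readily checked from (8.4), the distribution of $X$ is thus a shift of $G$:

$$X \overset{d}{=} G - 1. \tag{8.5}$$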
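The passage from (8.4) to (8.5) can be checked numerically. The sketch below assumes that $G$ is geometric on $\{1, 2, \dots\}$ with $P(G = j) = (1-p)^{j-1}p$ and that the lifetime and progeny share the same parameter; the function names are illustrative only.

```python
def geometric_pmf(j, p):
    """P(G = j) = (1 - p)**(j - 1) * p for j >= 1, and 0 otherwise."""
    return (1 - p) ** (j - 1) * p if j >= 1 else 0.0


def thinned_offspring_pmf(j, p_life, offspring_pmf):
    """P(X = j) as in (8.4): the vertex is terminal with probability p_life,
    and otherwise has a number of children drawn from offspring_pmf."""
    if j == 0:
        return p_life + (1 - p_life) * offspring_pmf(0)
    return (1 - p_life) * offspring_pmf(j)


def shifted_geometric_pmf(j, p):
    """P(G - 1 = j) = (1 - p)**j * p for j >= 0."""
    return (1 - p) ** j * p
```

For any $p$ in $(0,1)$, `thinned_offspring_pmf(j, p, lambda k: geometric_pmf(k, p))` agrees with `shifted_geometric_pmf(j, p)` for every $j \ge 0$, which is the content of (8.5).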




8.4. Fragmentations in the subcritical case. First note that if $\sigma$ is a uniform permutation on $\partial B(an)$, if we visit all points in $\{1, \dots, n\}$ according to their order of appearance in the canonical cyclic decomposition of $\sigma$, and call this process $V_t$ $(0 \le t < n)$, then the successive points are in some sense uniformly chosen from what remains to be found, at least as long as we do not have to start a new cycle. More precisely:


LEMMA 11. Given $(V_0, \dots, V_t)$ and given that $V_t$ is not a terminal point, $\sigma(V_t)$ is uniform on $\{1, \dots, n\} - \{V_0, \dots, V_t\}$.


The proof of this lemma follows directly from the Feller coupling presentation of a uniform permutation on $\mathfrak{S}_n$, which also has (obviously) this property. Conditioning on the number of cycles does not change how the cycles are filled in.


LEMMA 12. Suppose the branching process is subcritical, that is, $p > 1/2$ or (equivalently) $a < 1 - \log 2$. Then the number of fragmentations in the walk $(\sigma_r, r = 0, \dots, an)$ is $o(n)$.


Basically, all cycles are fairly small, so by improving our estimates on the number of collisions, we should get an $O(1)$ bound, just like in the Erdős–Rényi case. A technical difficulty arises from the fact that the cycle sizes are conditioned independent random variables, and not just independent. Here is a rigorous proof.


PROOF OF LEMMA 12. We proceed in two steps.
First, we prove a uniform bound for the size of a cluster: we show that if we write $\pi = \prod_{i=1}^{an} \tau_i$ and denote $\pi_r := \prod_{i=1}^{r} \tau_i$, then

$$P\bigl(|C_1(\sigma\cdot\pi_r)| > u \text{ for some } r \le an\bigr) \le Cn\exp(-\alpha u), \tag{8.6}$$

where $C$ and $\alpha$ are constants independent of $n$, and $u$ is any number.

Once this exponential control is proved, we can bound the number of times that one of the $\tau_r$'s will yield a fragmentation. Indeed, recall that to obtain $\sigma\cdot\pi$ we can perform successively the $\tau_r$'s on $\sigma$, and each one yields a coagulation or a fragmentation. We hence view this as a process indexed by $1 \le r \le an$. In the course of this process, at all times, by (8.6) applied to $u = (\log n)^2$, no cluster is larger than $(\log n)^2$ with overwhelming probability, so that by Lemma 11:

$$P(\tau_{r+1} \text{ yields a fragmentation}) \le 2(\log n)^2/n.$$

There are (exactly) $an$ transpositions to perform, hence:

$$E(\#\text{frag.}) \le 2a(\log n)^2.$$


This is already largely enough to prove Lemma 12. 




We will now prove that (8.6) holds, since this is the only thing that remains to be proved. Although we have seen that in the limit each cluster is a subcritical branching process (for which such an exponential tail of the total progeny holds), when $n$ is finite there is no real branching process available to dominate $C_1(\sigma\cdot\pi_r)$, essentially because the sizes of the cycles are not i.i.d. random variables. However, they are conditionally independent (cf. Theorem 9, or Theorem 3), and we will use this fact to construct a real branching process that dominates $C_1(\sigma\cdot\pi_r)$, when conditioned on some mild event. This conditioning accounts for the extra factor $n$ in (8.6).

Here is how we proceed. By Kolchin's representation theorem (Theorem 9), there are random variables $(X_1, \dots, X_{n(1-a)})$ such that the joint law of the sizes of the cycles of $\sigma$ is $(X_1, \dots, X_{n(1-a)})$ given $\sum_i X_i = n$ (we will call $A_n$ the event that $\sum_i X_i = n$). The $X_i$'s constitute a "pool" of possible cycle sizes. Similarly, there are random variables $(Y_1, \dots, Y_{n(1-a)})$ such that the joint law of the sizes of the cycles of $\pi$ is $(Y_1, \dots, Y_{n(1-a)})$ given $\sum_j Y_j = n$ (let $B_n$ be the event that $\sum_j Y_j = n$).

We give an upper bound of $|C_1(\sigma\cdot\pi)|$ in terms of the modified branching processes of Section 8.3, that uses only the $X_i$'s and the $Y_j$'s. Start with vertex 1 and choose a size-biased pick $X_1^*$ of the $X_i$'s (the cycle containing 1). Put $T(1) = Y_1^*$, a size-biased pick of the $Y_j$'s. Next, given $X_1^* = k$, put $T(2) = Y_2^*, \dots, T(k) = Y_k^*$. All vertices with positive lifetime $T$ have a number of children given by a size-biased pick of the remaining $X_i$'s. They transmit their lifetime minus 1 to one of their children and the rest have lifetimes given by size-biased picks from the remaining $Y_j$'s. Then repeat the procedure until we cannot go any further (i.e., until all vertices at a given generation have lifetime $T = 0$, or until all $X_i$'s and $Y_j$'s have been picked). Call $Z'$ the total population obtained at the end of this construction.

We claim that $Z'$ dominates all stages of $C_1(\sigma\cdot\pi_r)$, because $Z'$ gives the cycles of $\sigma$ coagulated by those of $\pi$, without taking any account of possible fragmentations. In particular, in $Z'$, as long as a vertex $x$ is not terminal ($T(x) > 0$), the children of $x$ will be part of the population of $Z'$. Of course in the event of a collision or a backward connection, $Z$ does not contain any additional children, so that $Z \le Z'$. Therefore


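$$\begin{aligned} P(Z > u) &\le P(Z' > u \mid A_n \text{ and } B_n)\\ &\le P(A_n)^{-1} P(B_n)^{-1} P(Z' > u;\ A_n;\ B_n)\\ &\le Cn\,P(Z' > u). \end{aligned}$$

Indeed, by the local central limit theorem (see [5]),

$$P(A_n) = P(B_n) = P\Bigl(\sum_{i=1}^{n(1-a)} X_i = n\Bigr) \sim C n^{-1/2}.$$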




To complete the proof, it remains to notice that size-biasing the logarithmic dis- 
tribution of Kolchin's theorem gives a geometric random variable. Therefore, by 
arguments already developed in the sketch of the proof, Z' is the total population 
of a branching process with offspring distribution X of (8.5), and starting with a 
geometric number of individuals G. Because p > 1/2, this branching process is 
subcritical. In this case, classical estimates [2, 5] show the exponential tail 


$$P(Z' > u) \le C\exp(-\alpha u).$$

This concludes the proof of (8.6), and also that of Lemma 12. □
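To make the exponential tail concrete, here is a small sketch that evaluates the law of the total progeny exactly. It assumes the shifted geometric offspring law $P(X = j) = (1-p)^j p$ and, unlike $Z'$ above, a single initial individual; it relies on the classical Otter–Dwass formula $P(T = n) = \frac{1}{n} P(X_1 + \dots + X_n = n - 1)$ together with the negative binomial law of a sum of geometric variables. The function name is illustrative.

```python
from math import comb


def total_progeny_pmf(n, p):
    """P(T = n) for a branching process with offspring law
    P(X = j) = (1 - p)**j * p, started from one individual.

    Uses P(T = n) = (1/n) * P(X_1 + ... + X_n = n - 1), where the sum of
    n such geometric variables is negative binomial.
    """
    return comb(2 * n - 2, n - 1) * p ** n * (1 - p) ** (n - 1) / n
```

Successive ratios `total_progeny_pmf(n + 1, p) / total_progeny_pmf(n, p)` stay below $4p(1-p)$, which is strictly less than 1 when $p \ne 1/2$, so the tail decays geometrically in the subcritical case $p > 1/2$.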


REMARK. It is possible to avoid the use of Kolchin's representation theorem in the above proof. Indeed, by Theorem 3, a size-biased pick of the cycles has, after unconditioning on some event of probability of order $n^{-1/2}$, a distribution which is given by the lengths of sequences of 1's in the Bernoulli trials $\xi_i$. However, since $\beta_i = P(\xi_i = 1) \le b/(1+b)$, it follows that the distribution of a size-biased cycle is (after unconditioning) stochastically dominated by the geometric random variable $G$.


8.5. Mean in the supercritical regime and duality. Although this may seem a 
little surprising at first, we use the result from the subcritical case to get that for 
the supercritical case. The idea is to use the duality of branching processes. 

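A crucial remark is that the number of cycles of $\sigma\cdot\pi$ is given by $\sum_{i=1}^{n} 1/|C_i(\sigma\cdot\pi)|$, hence by exchangeability

$$\frac{1}{n} E(\#\text{clusters of } \Gamma) = E\Bigl(\frac{1}{|C_1(\sigma\cdot\pi)|}\Bigr) \to E\Bigl(\frac{1}{T-1}\,\Big|\, T > 1\Bigr),$$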


where T is the total progeny of a branching process with offspring distributed as X 
and started with one individual. Indeed, let us not forget that the first generation Ao 
of the branching process is itself a geometric random variable, so we can add an 
imaginary root and then subtract it (thus T — 1). Introducing an extra vertex for the 
root allows us to make use of the duality principle of branching processes [2, 6]. 

The duality principle states that a supercritical branching process, conditioned on extinction, is another branching process, subcritical, whose offspring distribution is given through its generating function. If $\phi(s) = E(s^X)$ is the generating function of $X$ and $\alpha$ is the extinction probability $\alpha = P(T < \infty)$, then the conditioned process has offspring distribution characterized by

$$\phi'(s) = \phi(s\alpha)/\alpha.$$

Here, $P(X = j) = (1 - p)^j p$, so

$$\phi(s) = \frac{p}{1 - s(1 - p)}.$$

Therefore

$$\phi'(s) = \frac{p/\alpha}{1 - s\alpha(1 - p)}.$$

The fixed point equation $\alpha = \phi(\alpha)$ for $\alpha$ yields that

$$\alpha(1 - p) = 1 - \frac{p}{\alpha}, \tag{8.7}$$

so that $\phi'$ is the generating function of another shifted geometric random variable $X'$, with parameter $p' = p/\alpha$. Let $T'$ be the total progeny of a branching process with offspring $X'$ and started with one individual.
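As a sanity check on the fixed point equation, the extinction probability can be computed by iterating the generating function from 0, which converges to the smallest fixed point. The sketch assumes the offspring law $P(X = j) = (1-p)^j p$; the function names and the iteration count are illustrative choices.

```python
def offspring_pgf(s, p):
    """Generating function of P(X = j) = (1 - p)**j * p."""
    return p / (1 - s * (1 - p))


def extinction_probability(p, iterations=500):
    """Smallest fixed point of the generating function, by iteration from 0."""
    alpha = 0.0
    for _ in range(iterations):
        alpha = offspring_pgf(alpha, p)
    return alpha
```

Solving the quadratic by hand gives the two roots $1$ and $p/(1-p)$, so for $p < 1/2$ one expects $\alpha = p/(1-p)$ and hence a dual parameter $p' = p/\alpha = 1 - p$; the iteration reproduces these values, and returns 1 when $p > 1/2$.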

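Let us now relate the supercritical and subcritical regimes. By duality,

$$\begin{aligned} E\Bigl(\frac{1}{T-1}\,\Big|\, T > 1\Bigr) &= \frac{P(T < \infty)}{P(T > 1)}\, E\Bigl(\frac{1}{T-1}\mathbf{1}_{\{T > 1\}}\,\Big|\, T < \infty\Bigr)\\ &= \frac{P(T < \infty)}{P(T > 1)}\, E\Bigl(\frac{1}{T'-1}\mathbf{1}_{\{T' > 1\}}\Bigr)\\ &= \frac{P(T < \infty)\,P(T' > 1)}{P(T > 1)}\, E\Bigl(\frac{1}{T'-1}\,\Big|\, T' > 1\Bigr). \end{aligned}$$
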
However, for the subcritical regime, we know by Lemma 12 that there are only $o(n)$ fragmentations, so the distance between $\sigma$ and $\pi$ is $2an + o(n)$ and the number of clusters is $(1 - 2a)n + o(n)$. Hence $E\bigl(\frac{1}{T'-1} \mid T' > 1\bigr) = 1 - 2a'$, where $a'$ is the radius corresponding to the conditioned parameter $p'$. Since $p = 1/(1+b)$ and $\log(1+b)/b = 1 - a$, we have that

$$a = 1 + \frac{p\log p}{1 - p}.$$

On the other hand, due to the fixed point equation (8.7), the constant $P(T' > 1)/P(T > 1) = (1 - p')/(1 - p)$ simplifies into $\alpha$.
Therefore, Theorem 8 is proved when we show that

$$\alpha^2(1 - 2a') > 1 - 2a.$$
Using the fixed point equation (8.7), we find that $a' = 1 + p\log p'/(\alpha^2(1 - p))$, so that the above reduces to

$$-\alpha^2 - \frac{2p\log p'}{1 - p} > -1 - \frac{2p\log p}{1 - p}$$

or

$$\frac{2p\log\alpha}{1 - p} > \alpha^2 - 1.$$

Using one more time the fixed point equation, one gets

$$2p\log\alpha > \alpha - 1.$$




Since $\log(1 - x) > -x$, it is therefore enough to show

$$-2p(1 - \alpha) > \alpha - 1 \quad\text{or}\quad 2p < 1,$$


which is precisely the condition that the branching process is supercritical. 
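The target inequality $\alpha^2(1 - 2a') > 1 - 2a$ can also be checked directly. The sketch below assumes the relations stated above, namely $a = 1 + p\log p/(1-p)$ (from $p = 1/(1+b)$ and $\log(1+b)/b = 1 - a$), the nontrivial root $\alpha = p/(1-p)$ of the fixed point equation, and $p' = p/\alpha$. With these, the difference of the two sides equals $1 - \alpha^2 + 2\alpha\log\alpha$, which is positive for $0 < \alpha < 1$ since it vanishes at $\alpha = 1$ and is decreasing there. The function names are illustrative.

```python
from math import log


def radius(p):
    """Radius a associated with parameter p: a = 1 + p*log(p)/(1 - p)."""
    return 1 + p * log(p) / (1 - p)


def duality_gap(p):
    """alpha**2 * (1 - 2a') - (1 - 2a) for a supercritical parameter p < 1/2."""
    alpha = p / (1 - p)   # extinction probability (nontrivial fixed point)
    p_dual = p / alpha    # parameter of the conditioned (dual) process
    return alpha ** 2 * (1 - 2 * radius(p_dual)) - (1 - 2 * radius(p))
```

On a grid of values of $p$ in $(0, 1/2)$ the gap is strictly positive and shrinks to 0 as $p \to 1/2$, in line with the supercriticality condition $2p < 1$.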


8.6. Fragmentations in the supercritical range. In the previous section we have computed asymptotics for the expected number of clusters in the graph resulting from the superposition of the cycle structures of $\sigma$ and $\pi$. We now need to show that at the end of the walk $(\sigma_r, r = 0, \dots, an)$, there are no more than $o(n)$ additional cycles that have been generated by fragmentation, compared to the number of clusters of $\Gamma$.

To do this, we use once again the dynamic point of view adopted to deal with the subcritical regime. Let $\tau_1, \dots, \tau_{an}$ be the decomposition of $\pi$ into a product of $an$ transpositions as described earlier, and let $\sigma_r = \sigma\cdot\tau_1\cdots\tau_r$.


LEMMA 13. For each $1 \le r \le an$, the expected number of cycles in $\sigma_r$ generated by fragmentation is $O(n^{1/2})$.


This is similar to the Erdős–Rényi case of Berestycki and Durrett [3], Theorem 3. Lemma 13 does not claim that the number of fragmentations itself is $O(n^{1/2})$, but that the number of extra cycles generated by fragmentation is $O(n^{1/2})$. Just like in the Erdős–Rényi case, many of the cycles that are fragmented get reabsorbed by large components fairly quickly.


PROOF OF LEMMA 13. There can never be more than $n^{1/2}$ cycles of size larger than $n^{1/2}$. On the other hand, by Lemma 11, the probability that $\tau_r$ will create a fragment of size smaller than $n^{1/2}$ is at most $n^{1/2}/n = n^{-1/2}$. Therefore the expected number of such fragmentations is at most $an \cdot n^{-1/2} = O(n^{1/2})$. □


At this point, Theorem 8 is proved. □


Acknowledgments. I am very grateful to Laurent Saloff-Coste for introducing me to Gromov hyperbolic spaces and to Jim Pitman for pointing out the reference to Kolchin's representation theorem. I wish to thank Rick Durrett particularly for his help at several important stages of this work and for his support during my stay at Cornell University.


REFERENCES 


[1] ARRATIA, R., BARBOUR, A. D. and TAVARÉ, S. (2003). Logarithmic Combinatorial Structures: A Probabilistic Approach. European Mathematical Society Publishing House, Zürich. MR2032426
[2] ATHREYA, K. B. and NEY, P. E. (1972). Branching Processes. Springer, New York. MR0373040
[3] BERESTYCKI, N. and DURRETT, R. (2006). A phase transition in the random transposition random walk. Probab. Theory Related Fields. To appear.
[4] DEMBO, A. and ZEITOUNI, O. (1993). Large Deviations Techniques and Applications. Jones and Bartlett, Boston, MA. MR1202429
[5] DURRETT, R. (2004). Probability: Theory and Examples, 3rd ed. Duxbury Press, Belmont, CA. MR1609153
[6] DURRETT, R. (2005). Random Graphs. In preparation.
[7] GROMOV, M. (1987). Hyperbolic groups. Math. Sci. Res. Inst. Publ. 8 75–263. MR0919829
[8] JANSON, S., ŁUCZAK, T. and RUCIŃSKI, A. (2000). Random Graphs. Wiley, New York. MR1782847
[9] PITMAN, J. (2002). Poisson-Dirichlet and GEM invariant distributions for split-and-merge transformations of an interval partition. Combin. Probab. Comput. 11 501–514. MR1930355
[10] PITMAN, J. (2006). Combinatorial Stochastic Processes. Ecole d'Été de Probabilités de Saint-Flour XXXII. Lecture Notes in Math. 1875. Springer, Berlin.
[11] PRAT, J. J. (1971). Étude asymptotique du mouvement Brownien sur une variété Riemannienne à courbure négative. C. R. Acad. Sci. Sér. A-B 272 A1586–A1589. MR0279889
[12] SCHRAMM, O. (2005). Composition of random transpositions. Israel J. Math. 147 221–243. MR2166362


DEPARTMENT OF MATHEMATICS 
UNIVERSITY OF BRITISH COLUMBIA 
1984 MATHEMATICS ROAD—ROOM 121 
VANCOUVER, BRITISH COLUMBIA 
CANADA V6T 1Z2 

E-MAIL: nberestycki@math.ubc.ca


The Annals of Probability 

2006, Vol. 34, No. 2, 468-492 

DOI: 10.1214/009117905000000639 

© Institute of Mathematical Statistics, 2006 


ASYMPTOTIC LAWS FOR COMPOSITIONS DERIVED 
FROM TRANSFORMED SUBORDINATORS¹


BY ALEXANDER GNEDIN, JIM PITMAN AND MARC YOR 
Utrecht University, University of California and University of Paris VI 


A random composition of $n$ appears when the points of a random closed set $\mathcal{R} \subset [0,1]$ are used to separate into blocks $n$ points sampled from the uniform distribution. We study the number of parts $K_n$ of this composition and other related functionals under the assumption that $\mathcal{R} = \phi(S_\bullet)$, where $(S_t, t \ge 0)$ is a subordinator and $\phi : [0,\infty] \to [0,1]$ is a diffeomorphism. We derive the asymptotics of $K_n$ when the Lévy measure of the subordinator is regularly varying at 0 with positive index. Specializing to the case of exponential function $\phi(x) = 1 - e^{-x}$, we establish a connection between the asymptotics of $K_n$ and the exponential functional of the subordinator.


1. Introduction. A composition of $n$ with positive integer parts may be represented by a configuration of stars separated by bars, for instance, ***|**|****|* encodes the composition (3, 2, 4, 1) with weight 10, length 4 and four parts 3, 2, 4, 1. A stochastic analogue of this construction appears when we assume the points of a closed random set $\mathcal{R} \subset [0,1]$ in the role of bars, and $n$ independent random points sampled from the uniform distribution on $[0,1]$ in the role of stars, see [11, 13, 16, 15, 25]. Given this data, we define an ordered partition of the set $\{u_1, \dots, u_n\}$ by assigning two points $u_i < u_j$ to the same block if and only if $u_i$ and $u_j$ are not separated by $\mathcal{R}$, meaning that $\mathcal{R} \cap [u_i, u_j] = \varnothing$. That is to say, $u_j \in \mathcal{R}$ forms a singleton block, and if $u_i$ and $u_j$ fall in the same gap (open interval component of $[0,1] \setminus \mathcal{R}$), then these points are assigned to the same block. A composition $C_n$ is defined to be the record of block sizes, ordered from left to right. Exchangeability in the infinite sample $u_1, u_2, \dots$ results in a simple consistency condition of the $C_n$'s as $n$ varies, that is, the sequence $(C_n)$ is a composition structure in the sense of [11, 12, 16].
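The construction of the block sizes can be illustrated with a short sketch. It assumes, purely for illustration, that the closed set is a finite list of separator points (a random closed set would in general be infinite); the function name `composition` is hypothetical.

```python
from bisect import bisect_left, bisect_right


def composition(points, separators):
    """Block sizes, from left to right, of the sample `points` separated by
    the finite closed set `separators`.

    Two consecutive sample points u < v share a block exactly when no
    separator lies in the closed interval [u, v].
    """
    seps = sorted(separators)
    blocks = []
    previous = None
    for u in sorted(points):
        separated = (
            previous is None
            or bisect_right(seps, u) - bisect_left(seps, previous) > 0
        )
        if separated:
            blocks.append(1)   # start a new block
        else:
            blocks[-1] += 1    # same gap as the previous point
        previous = u
    return blocks
```

With separators at 0.3, 0.5 and 0.9 and ten sample points distributed as three, two, four and one across the resulting gaps, the function returns the composition (3, 2, 4, 1) of the stars and bars example; a sample point that coincides with a separator forms a singleton block.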

The model just described offers a general framework for a wide range of 
"species sampling" problems, as studied in statistics and population genetics. In 
these applications one postulates some idealized infinite population, randomly par- 
titioned into various species, with a total order on the set of species. A sample 
from such a population is understood as an exchangeable sequence of random 
variables $(X_j)$, and a composition $C_n$ is defined as the record of multiplicities of


Received March 2004; revised December 2004. 
¹Supported in part by NSF Grant DMS-00-71448.
AMS 2000 subject classifications. 60G09, 60C05.
Key words and phrases. Composition structure, regenerative set, sampling formulae, regular vari- 
ation. 




distinct values represented among $X_1, \dots, X_n$, in the order of increase of the values. Then $(C_n)$ is a composition structure and by a de Finetti-type result [11], it can be uniquely associated with some random closed set $\mathcal{R} \subset [0,1]$, which appears as a way to uniformize the limiting empirical distribution of $(X_j)$.

Let $K_n$ be the length of $C_n$ (this variable may be interpreted as the number of distinct species in a sample). The growth properties of moments of $K_n$ are sensitive functions of the random set $\mathcal{R}$. Logarithmic and power-like asymptotics of the moments are known in the case when $\mathcal{R}$ is derived by scaling the range of a subordinator $(S_t, t \in [0, T])$, that is, an increasing process with stationary independent increments restricted to a finite time interval [1, 25, 24, 26]. (See [3] for general background on subordinators.)

In this paper we study asymptotic properties of $K_n$ for the random sets obtained by transforming the unrestricted range of a subordinator. Specifically, we consider $\mathcal{R} = \phi(S_\bullet)$, where $(S_t, t \ge 0)$ is a drift-free subordinator and $\phi : [0,\infty] \to [0,1]$ is a diffeomorphism. We assume the Lévy measure $\nu$ of the subordinator to be regularly varying in the sense that


$$\nu[y, \infty] = \ell(1/y)\,y^{-\alpha}, \qquad y \downarrow 0, \tag{1}$$


where $0 < \alpha < 1$ and the function $\ell$ is slowly varying at $\infty$. We also consider the process $K_n(t)$, the number of parts of the partial composition produced by the transformed subordinator restricted to the time interval $[0, t]$. Other quantities of interest are $K_{n,r}$ and $K_{n,r}(t)$, defined as the multiplicity of part $r$ in $C_n$ and multiplicity in the partial composition, respectively.

We show that, as $n \to \infty$, the length $K_n$ is asymptotic to a power-like, regularly varying function of $n$ multiplied by a random factor $L$. The factor $L$ is identified explicitly as an integral functional of the subordinator. Similar results also hold for $K_{n,r}$, $K_n(t)$ and $K_{n,r}(t)$. The appearance of a random factor is due to variability in the gap sizes, as can be compared with a result by Karlin [20] which states that the number of distinct values in a large sample from an arbitrary nonrandom discrete distribution is asymptotic to the mean number of such values.

In the special case $\phi(y) = 1 - e^{-y}$, the set $\mathcal{R}$ is a multiplicative subordinator and the composition $C_n$ inherits a characteristic regenerative property from this set [16, 14]. We show that $L$ specializes in this case as the well-known exponential functional of the subordinator. The distribution of $L$ is then uniquely determined by the power moments which are given by a known formula reproved here by elementary tools in the case of subordinators.

In the regenerative case, the distribution of $K_n$ is well known for the composition described by Ewens' sampling formula, in which case $K_n$ is of logarithmic growth [1, 25]. More generally, Gnedin [13] has previously shown that the logarithmic growth of $K_n$ is typical when the Lévy measure is finite. For compositions belonging to the two-parameter family [16, 21, 25], the proper scaling for $K_n$ is $n^\alpha$ for parameters $(\alpha, \theta)$ with $0 < \alpha < 1$. Another interesting case is that of slow variation, when the relation (1) holds with $\alpha = 0$ and some $\ell(1/y)$ exploding at 0. This includes the gamma subordinators whose Lévy measure has a logarithmic singularity. This case is very different from the case of regular variation with positive index $\alpha$ and is being treated separately [2, 17].

We shall be assuming throughout that (1) holds, which entails that the Lévy measure is infinite. When the Lévy parameters $(\nu, d)$ are multiplied by a positive factor $c$, the variables $K_{n,r}$ remain unchanged, but $K_{n,r}(t)$ should be replaced by $K_{n,r}(t/c)$. Unless explicitly stated otherwise, we assume that the Lévy measure satisfies $\nu\{\infty\} = 0$ and that the drift coefficient is 0.

It should be mentioned that there are many other constructions of random compositions, but typically these compositions are not consistent as $n$ varies. One obvious possibility, in terms of the "stars and bars" representation, is to exploit the Bernoulli scheme, that is, to allocate a bar at each possible position with fixed probability $p$. (The particular choice $p = 1/2$ corresponds to the uniform distribution on the set of all compositions of weight $n$, see [18].) The expected length of such a composition grows linearly with $n$, while, for composition structures, we have $E K_n = o(n)$, provided the Lebesgue measure of $\mathcal{R}$ is 0 (which means that the positive frequencies of distinct species sum to 1). See [15] for a complete characterization of the composition structures obtained by truncating a single infinite sequence of stars separated by bars at positions visited by some increasing random process on integers.

The rest of the paper is organized as follows. In the next three sections we modify Karlin's results on occupancy problems, we provide some analysis of the gap counts necessary to apply these results to the composition derived by a general transform of a subordinator and we formulate the strong laws for $K_n$ and the like. We then specialize to multiplicative subordinators in Section 5. In Section 6 we continue to consider the regenerative case, but replace the fixed-$n$ sample by a Poisson point process; we then analyze recursions for the moments of the length of the poissonised composition and show the convergence of the scaled moments of $K_n$.


2. General strong laws. Karlin [20] studied the number of different types represented in a sample from a fixed discrete distribution with infinitely many positive masses. His results open a clear path to the strong laws for $K_n$ and $K_{n,r}$.
Let $\mathcal{R}$ be an arbitrary closed subset of $[0,1]$ with zero Lebesgue measure. Let $C_n$ be the composition derived from $\mathcal{R}$ by separating uniform points. Conditionally given $\mathcal{R}$, the number of parts of $C_n$ is the same as the number of different types represented in a sample from the discrete distribution with masses equal to the gap-sizes. Therefore, by [20], Theorem 8, as $n \to \infty$,


$$K_n \sim E(K_n \mid \mathcal{R}), \qquad K_{n,r} \sim E(K_{n,r} \mid \mathcal{R}), \qquad r \ge 1, \tag{2}$$


where $\sim$ means that the ratio converges to 1 almost surely. For $x > 0$, let $N_x$ be the number of gaps of $\mathcal{R}$ of size at least $x$. The following is a variation of [20], Theorem 1, equation (23) and page 396. See also [25], Lemma 34.




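THEOREM 2.1. Let $\ell$ be a positive slowly varying function and $L$ a nonnegative random variable. The convergence

$$\frac{N_x}{\ell(1/x)\,x^{-\alpha}} \to L \quad\text{a.s.}, \qquad x \downarrow 0,$$

with $0 < \alpha < 1$ implies, for $n \to \infty$,

$$\frac{K_n}{n^\alpha \ell(n)} \to \Gamma(1 - \alpha)\,L, \qquad \frac{K_{n,r}}{n^\alpha \ell(n)} \to \frac{\alpha\,\Gamma(r - \alpha)}{r!}\,L,$$

almost surely, and the same convergence with $\alpha = 1$ implies

$$\frac{K_n}{n\,\ell^*(n)} \to L, \qquad \frac{K_{n,1}}{n\,\ell^*(n)} \to L, \qquad \frac{K_{n,r}}{n\,\ell(n)} \to \frac{L}{r(r-1)} \quad\text{for } r > 1,$$

almost surely. Here $\ell^*$ is another function of slow variation at $\infty$, defined by the converging integral

$$\ell^*(t) = \int_0^\infty \frac{e^{-1/y}}{y}\,\ell(ty)\,dy. \tag{3}$$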


PROOF. Let us start with $K_n$. By (2), it is sufficient to determine the asymptotics of the conditional expectation. To this end, we introduce a random measure $\gamma$ on $]0, 1]$ by defining its tail

$$\gamma(x) = \gamma[x, 1] = N_x, \qquad x \in\, ]0, 1],$$

to be the number of gaps of $\mathcal{R}$ of size at least $x$. The measure $\gamma$ is atomic and assigns to each $x \in\, ]0, 1]$ an integer weight equal to the number of gaps of $\mathcal{R}$ of length $x$. For a particular gap of length $x$, the probability that at least one of $n$ uniform sample points hits this gap is $1 - (1 - x)^n$, so

$$E(K_n \mid \mathcal{R}) = \int_0^1 \bigl(1 - (1 - x)^n\bigr)\,\gamma(dx) = n\int_0^1 (1 - x)^{n-1}\,\gamma(x)\,dx, \tag{4}$$

where the second equality is obtained by integration by parts. Observe that the formula

$$\int_0^1 \gamma(x)\,dx = \int_0^1 x\,\gamma(dx) = 1 \tag{5}$$

simply says that the total length of gaps equals 1, thus, the measure $\gamma(x)\,dx$, $x \in [0, 1]$, is a probability measure with nonincreasing density, which takes only nonnegative integer values. In the last integral in (4) we recognize a Mellin transform and standard Abel-Tauberian arguments (see Appendix) imply that, for $0 < \alpha < 1$,

$$\gamma(x) \sim x^{-\alpha}\,\ell(1/x)\,L \quad\text{for } x \downarrow 0 \qquad\text{iff}$$

$$n\int_0^1 (1 - x)^{n-1}\,\gamma(x)\,dx \sim \Gamma(1 - \alpha)\,n^\alpha\,\ell(n)\,L \quad\text{for } n \to \infty$$




and the result follows in this case. In the case $\alpha = 1$, the Mellin integral is asymptotic to the Laplace integral

$$\int_0^\infty e^{-nx}\,x^{-1}\,\ell(1/x)\,dx,$$

which converges due to (5), and becomes (3) upon substituting $nx = 1/y$. The slow variation claim for $\ell^*$ is Lemma 4 in [20].
For $K_{n,r}$, we have a similar integral representation

$$E(K_{n,r} \mid \mathcal{R}) = \binom{n}{r}\int_0^1 x^r (1 - x)^{n-r}\,\gamma(dx), \tag{6}$$

which is obtained by a formal binomial expansion of $1 - (1 - x)^n$. The formula follows by observing that a gap of length $x$ is hit by exactly $r$ sample points with probability $\binom{n}{r} x^r (1 - x)^{n-r}$. A Tauberian argument applied to the measure $x^r\,\gamma(dx)$ yields

$$E(K_{n,r} \mid \mathcal{R}) \sim \frac{\alpha\,\Gamma(r - \alpha)}{r!\,\Gamma(1 - \alpha)}\,E(K_n \mid \mathcal{R}),$$

which ends the proof. □
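For a set with finitely many gaps, the conditional expectations in (4) and (6) reduce to finite sums over the gap lengths, which makes them easy to evaluate. The following sketch assumes a hypothetical list `gaps` of gap lengths summing to 1; the function names are illustrative.

```python
from math import comb


def expected_parts(n, gaps):
    """E(K_n | R): each gap of length x is hit with probability 1 - (1 - x)**n."""
    return sum(1 - (1 - x) ** n for x in gaps)


def expected_parts_of_size(n, r, gaps):
    """E(K_{n,r} | R): a gap of length x receives exactly r of the n points
    with probability C(n, r) * x**r * (1 - x)**(n - r)."""
    return comb(n, r) * sum(x ** r * (1 - x) ** (n - r) for x in gaps)
```

Two consistency checks follow from the binomial theorem: summing `expected_parts_of_size(n, r, gaps)` over $r = 1, \dots, n$ recovers `expected_parts(n, gaps)`, and weighting each term by $r$ gives $n$, the total sample size.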


REMARKS. For the two slowly varying functions in the theorem, we have $\ell^*(t)/\ell(t) \to \infty$ as $t \to \infty$ (formula (13) in [20] is misprinted); for example, for $\ell(t) = (\log t)^{-\mu}$, $\mu > 1$, we have $\ell^*(t) \sim (\mu - 1)^{-1}(\log t)^{1-\mu}$. See [6], Chapter 3, for results involving two slowly varying functions like $\ell$ and $\ell^*$. The relation between the asymptotics of (4) and (6) is an instance of "smooth variation" properties ([6], Section 1.8) of the Bernstein function defined by (4).


Under the assumption of Theorem 2.1, the conditional distribution of $K_n$ given $\mathcal{R}$ approaches a normal distribution as $n \to \infty$, by [20], Theorem 4 (also see [8], Theorem 2). Karlin's results also imply a multivariate normal limit for the conditional distribution of the sequence $(K_{n,r}, r \ge 1)$.


3. Counting the gaps. Let $(S_t, t \ge 0)$ be a subordinator with the drift coefficient $d = 0$ and a Lévy measure $\nu$ satisfying $\nu\{\infty\} = 0$. The jumps of $(S_t)$ correspond to the gaps of its range $\widetilde{\mathcal{R}} \subset [0, \infty]$, which is a topological Cantor set provided the Lévy measure is infinite.

Let $\phi : [0, \infty[\, \to [0, 1[$ be a diffeomorphism, that is, a continuously differentiable function satisfying $\phi(0) = 0$, $\phi(\infty-) = 1$ and $\phi'(t) > 0$. For $\mathcal{R} = \phi(\widetilde{\mathcal{R}}) \subset [0, 1]$, the gaps comprising $\mathcal{R}^c = [0, 1] \setminus \mathcal{R}$ correspond to the jumps of the subordinator transformed by $\phi$. Let $N_x(t)$ be the number of jumps of size at least $x$ for the transformed subordinator restricted to $[0, t]$, and let $N_x = N_x(\infty)$ be the number of such gaps of $\mathcal{R}$ without restriction. We are interested in the asymptotics




of these gap counts for small $x$. A similar analysis has appeared in [23] in the case of stable subordinators.

The analogous question for the original subordinator $(S_t)$ is easy. Let $\widetilde{N}_y(t)$ be the number of gaps of the range $\widetilde{\mathcal{R}}$ of the subordinator of size at least $y$, generated by the subordinator restricted to $[0, t]$. The counting process $(\widetilde{N}_y(t), t \ge 0)$ is a Poisson process with rate $\nu[y, \infty]$, thus, for small $y$, the behavior of this process is ruled by the strong law of large numbers:


$$\widetilde{N}_y(t) \sim \nu[y, \infty]\,t, \qquad y \downarrow 0, \tag{7}$$


almost surely for all $t$. We shall see that translating this behavior into similar results for $N_x(t)$ and $N_x = N_x(\infty)$ amounts to a change of variable formula which was stated in [23], albeit under different assumptions on $\phi$.

Speaking more broadly, we may wonder about the conditions on $\phi$ and $\nu$ which imply an asymptotic relation analogous to (7) of the type


$$N_x(t) \sim \psi(x)\,L(t), \qquad x \downarrow 0, \tag{8}$$


where $\psi$ is a scaling function and $(L(t), t \in [0, \infty])$ is a positive random process. A principal new effect appearing in (8), as compared to (7), is that a nonlinear transformation of the subordinator leads to a genuinely random scaling limit. The next question to ask is whether such a relation holds with some $L$ for $t = \infty$ and whether $L(\infty-) = L$, and we shall find conditions when this is true.

For a Lévy measure as in (1), we shall use the scaling function


$$\psi(x) = x^{-\alpha}\,\ell(1/x).$$


3.1. Finite t formula. Let $\widetilde{N}_y(t_1, t_2)$ be the number of jumps of $(S_t, t \in [t_1, t_2])$ of size at least $y$, with the convention that this is zero for $t_1 > t_2$.


THEOREM 3.1. If the Lévy measure satisfies (1), then for each diffeomorphism $\phi : [0, \infty] \to [0, 1]$ and $0 < t < \infty$, the convergence, as $x \to 0$,

$$\frac{N_x(t)}{\psi(x)} \to \int_0^t \bigl(\phi'(S_u)\bigr)^\alpha\,du \tag{9}$$

holds in the mean for each $t$, as well as almost surely, uniformly in $t$ bounded away from $\infty$.


PROOF. Consider a partition of $[0, t]$ by points $0 = t_0 < t_1 < \dots < t_k = t$; with probability 1, each $t_j$ is a continuity point of the subordinator. Write $N_x(t_j, t_{j+1})$ for the number of jumps of size at least $x$ of the transformed subordinator on $[t_j, t_{j+1}]$. As is easily seen,

$$\widetilde{N}_{x/\phi'(\eta_j)}(t_j, t_{j+1}) \le N_x(t_j, t_{j+1}) \le \widetilde{N}_{x/\phi'(\xi_j)}(t_j, t_{j+1}),$$

where $\xi_j$ and $\eta_j$ are the points where $\phi'$ attains the maximum and the minimum on $[S_{t_j}, S_{t_{j+1}}]$, respectively. Taken together with (7), this implies

$$(t_{j+1} - t_j)\,\nu[x/\phi'(\eta_j), \infty] \preceq N_x(t_j, t_{j+1}) \preceq (t_{j+1} - t_j)\,\nu[x/\phi'(\xi_j), \infty], \tag{10}$$

where the notation $X \preceq Y$ for positive random quantities depending on $x$ means that $\operatorname{ess\,sup} X/Y \le 1$ for $x \downarrow 0$. From this and the assumption on $\nu$,

$$\psi(x)\sum_{j=0}^{k-1} \bigl(\phi'(\eta_j)\bigr)^\alpha (t_{j+1} - t_j) \preceq N_x(t) \preceq \psi(x)\sum_{j=0}^{k-1} \bigl(\phi'(\xi_j)\bigr)^\alpha (t_{j+1} - t_j).$$

We see that $N_x(t)/\psi(x)$, for $x \downarrow 0$, can be squeezed between an upper and a lower Riemann sum; thus, sending the diameter of the partition to zero and using the continuity, we obtain the almost sure convergence (9). Using the obvious bound $N_x(t) \le \widetilde{N}_{x/\max\phi'}(t)$, where the maximum is taken over $[0, t]$, the convergence in mean follows by dominated convergence. □


There is a minor generalization of the formula for integrals with random upper bound. By a stopping time $\tau$, we understand a random variable taking values in $[0, \infty]$ and such that $((\tau \wedge t), t \ge 0)$ is adapted to the natural filtration of the subordinator.


COROLLARY 3.2. If the Lévy measure satisfies (1), then for each diffeomorphism $\phi : [0, \infty[\, \to [0, 1[$ with $\sup \phi' < \infty$, and each stopping time $\tau$ with $E\tau < \infty$, the convergence, as $x \to 0$,

$$\frac{N_x(\tau)}{\psi(x)} \to \int_0^\tau \bigl(\phi'(S_u)\bigr)^\alpha\,du$$

holds almost surely and in the mean.


PROOF. The almost sure convergence follows from Theorem 3.1. The convergence in the mean is a consequence of

$$\lim_{t \to \infty} \limsup_{x \downarrow 0} E N_x(t, \tau)/\psi(x) = 0,$$

which, in turn, follows from

$$E N_x(t, \tau) \le E \widetilde{N}_{x/\sup\phi'}(t, \tau) = E \widetilde{N}_{x/\sup\phi'}\bigl((\tau - t)_+\bigr) = \nu\Bigl[\frac{x}{\sup\phi'}, \infty\Bigr]\,E(\tau - t)_+,$$

by application of Wald's identity,

$$E \widetilde{N}_y(\tau) = \nu[y, \infty]\,E\tau,$$

and the fact that $E(\tau - t)_+ \to 0$ for $t \to \infty$. □

The corollary can be applied to a subordinator killed at an independent time $\tau$. For example, when $\nu\{\infty\} > 0$, the subordinator jumps to infinity at an independent exponential time.




3.2. Full range formula. We turn next to finding some conditions for the convergence

(11) $$\tilde N_x/\psi(x) \to \int_0^\infty (\phi'(S_u))^\alpha \, du,$$

in which case (8) holds for $t = \infty$ with $L = \lim_{t \uparrow \infty} L(t)$ given by the integral
in (11). One condition which seems very natural is the integrability

(12) $$E \int_0^\infty (\phi'(S_t))^\alpha \, dt < \infty.$$


Granted this integrability condition only, we failed to prove or disprove whether 
the convergence holds in full generality, nor is there an obvious sufficient condi- 
tion which would cover the cases of interest including slowly varying functions 
|log y| or 1/|log y| and diffeomorphisms with exponentially decaying or power- 
like tails. 

To secure the convergence, we shall make some additional assumptions about
$\phi$ and $\nu$. The analysis is largely simplified by further assuming that the derivative
$\phi'$ is a decreasing function; in this case, in (10), we can set $\xi_j = t_j$ and
$\eta_j = t_{j+1}$, and then, for any $t_1 < t_2$, conditioning on $S_{t_1}$ yields the inequality

(13) $$E\tilde N_x(t_1, t_2) \le (t_2 - t_1)\, E\, \vec\nu\!\left(\frac{x}{\phi'(S_{t_1})}\right),$$

which is valid for each $x > 0$.


THEOREM 3.3. Assume that the Lévy measure is as in (1), the diffeomorphism
$\phi$ has decreasing derivative, the integrability condition (12) holds, and one
of the following single or composite conditions is satisfied:

(i) for some constants $a > 0$, $C > 0$, the inequality $\ell(u) \le C\ell(v)$ holds for
$u < av$, provided $u$ and $v$ are sufficiently large;

(ii) there exist functions $q$ and $r$ such that:

(iia) $q(y) = o(y\,\psi(y))$ as $y \downarrow 0$;

(iib) for some constants $a > 0$, $C > 0$, the inequalities $r(1/v)\,v < u < av$
imply $\ell(u) \le C\ell(v)$ for all sufficiently large $u$, $v$;

(iic) $\vec\nu(x/r(x))\, \phi^{-1}(1 - q(x)) = o(\psi(x))$ as $x \downarrow 0$;

(iii) $\ell$ is bounded away from $0$ and $\infty$ on every compact subset of $[0, \infty[$ and a
stronger integrability condition holds, with $\alpha$ in (12) replaced by $\alpha - \delta$, for some
$\delta > 0$;

(iv) the same integrability condition holds as in (iii) and there exists a function
$q$ which satisfies (iia), as well as $\phi^{-1}(1 - q(y)) = o(\psi(y))$ for $y \downarrow 0$.

Then the convergence (11) holds almost surely and in the mean.




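PROOF. In view of $\liminf \tilde N_x/\psi(x) \ge L(\infty-)$, it is sufficient to establish the
convergence of expectations

$$\lim_{x \downarrow 0} E\tilde N_x/\psi(x) = E \int_0^\infty (\phi'(S_t))^\alpha \, dt.$$

Thanks to both convergence results,

$$\lim_{t \to \infty} E \int_0^t (\phi'(S_u))^\alpha \, du = E \int_0^\infty (\phi'(S_u))^\alpha \, du < \infty,$$

$$\lim_{x \to 0} E\tilde N_x(t)/\psi(x) = E \int_0^t (\phi'(S_u))^\alpha \, du < \infty,$$

and in view of $\tilde N_x = \tilde N_x(0, t) + \tilde N_x(t, \infty)$, we only need to show that

$$\lim_{t \to \infty} \limsup_{x \downarrow 0} E\tilde N_x(t, \infty)/\psi(x) = 0.$$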


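Using monotonicity,

$$E\tilde N_x(t, \infty) = \sum_{j=0}^\infty E\tilde N_x(t+j,\, t+j+1) \le \sum_{j=0}^\infty E\, \vec\nu\!\left(\frac{x}{\phi'(S_{t+j})}\right) = E \sum_{j=0}^\infty (\phi'(S_{t+j}))^\alpha\, x^{-\alpha}\, \ell\!\left(\frac{\phi'(S_{t+j})}{x}\right),$$

whence

(14) $$\frac{E\tilde N_x(t, \infty)}{x^{-\alpha}\ell(1/x)} \le E \sum_{j=0}^\infty (\phi'(S_{t+j}))^\alpha\, \frac{\ell(\phi'(S_{t+j})/x)}{\ell(1/x)}.$$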
The rest of the proof amounts to estimating the right-hand side of this formula. 

Let Ty be the first passage time over the level ($^) * (x), when $’(S;) drops be- 
low x. As a consequence of Corollary 3.2, the sum of terms in (14) with t+ j < Ta 
is negligible, for each fixed a > 0. And assuming (i), we estimate the contribution 
of terms in (14) with t + j > tg by 


$$C\, E \int_{t-1}^\infty (\phi'(S_u))^\alpha \, du,$$


which vanishes for t — oo due to (12). 

Assume (it). Let o; be the first passage time over the level $ * (2). By (iia), the 
contribution of terms with t + j > oj~g x) is negligible because at most q(x)/x 
gaps longer than x can fit into an interval of size q(x). The contribution of terms 
With f + j < Trx) also vanishes, as x | O and then 1 — oo, for the same reason as 
under the condition (i). Finally, 


ER, (tx, 0) < elas) (ox — )«) 


2 5( 5) Eoo = o(x^* £(1/x)) 




by virtue of (iic), because the expected time for the subordinator to pass a high 
level y = $ * (1 —q (x)) can be estimated from the above by by, with some positive 
constant b. 

Suppose (iii) holds. By Potter's Theorem A.3(ii), there exists $C > 1$ such that
$\ell(u)/\ell(v) \le C(u/v)^{-\delta}$ for all $u < v$. We apply this with $u = \phi'(S_t)/x$ and $v = 1/x$
and then use the integrability with the exponent $\alpha - \delta$.

Under assumption (iv), we make use of another part of the same Theorem A.3(1), 
which guarantees the same inequality for sufficiently large parameters, say, u > A, 
v > A. For £($'(S,,. j)) with ż + j < tax we have then the same inequality as 
in (iii), thus, the contribution of such terms can be estimated as in the case (1), but 
with the exponent o — ô. The remaining sum is bounded from above by 


V(A)Eo..6(x) 
analogously to the case (ii). C 


The integrability condition (12) can be ensured by means of the following 
lemma found in [3], page 28. 


LEMMA 3.4. For each decreasing positive function $g$ on $[0, \infty[$ and each
subordinator $(S_t, t \ge 0)$, the following properties are equivalent:

$$\int_0^\infty g(t)\, dt < \infty \iff \int_0^\infty g(S_t)\, dt < \infty \ \text{a.s.}$$


REMARKS. The function $\ell(1/y) = |\log y|^{\rho}$, $\rho > 0$, is decreasing, thus,
part (i) of the theorem applies. For $\phi(y) = 1 - (y + 1)^{-\beta}$, the integrability
condition is fulfilled if $\alpha(\beta + 1) > 1$, thus, selecting $b$ in the range
$1 - \alpha < b < \alpha\beta$, part (iv) applies for any $\ell$, with $q(y) = y^b$. Part (ii) is useful for
$\ell(1/y) = |\log y|^{-\rho}$, $\rho > 0$, in which case we can take $r(y) = y^b$, $0 < b < 1$, to
meet (iib).

Note that, for $q(y) = y^b$, $0 < b < 1$, the condition on $\phi^{-1}$ can be reformulated
in terms of $\phi$; for example, $\phi^{-1}(1 - x) = o(x^{-\alpha})$ is equivalent to
$1 - \phi(y) = o(y^{-1/\alpha})$.


4. Strong laws. Strong laws for $K_n$ and the like for a composition derived from
a transformed subordinator follow by combining the results in the two previous
sections. Introduce

(15) $$L(t) = \int_0^t (\phi'(S_u))^\alpha \, du, \qquad L = L(\infty).$$

Recall that the notation $K_n(t)$ or $K_{n,r}(t)$ refers to the parts of the partial composition
produced by the range of $(\phi(S_u),\ u \in [0, t])$.




THEOREM 4.1. For $0 < \alpha < 1$, the regular variation assumption (1) implies

$$\frac{K_n(t)}{\Gamma(1-\alpha)\, n^\alpha \ell(n)} \to L(t), \qquad \frac{K_{n,r}(t)}{\Gamma(1-\alpha)\, n^\alpha \ell(n)} \to (-1)^{r-1} \binom{\alpha}{r} L(t),$$

with probability 1, as $n \to \infty$. And if $\phi$ satisfies also the conditions of Theorem 3.3,
we have

$$\frac{K_n}{\Gamma(1-\alpha)\, n^\alpha \ell(n)} \to L, \qquad \frac{K_{n,r}}{\Gamma(1-\alpha)\, n^\alpha \ell(n)} \to (-1)^{r-1} \binom{\alpha}{r} L.$$

For $\alpha = 1$, the analogous results are read from Theorem 2.1.
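The constants multiplying the limit in the block-count asymptotics can be checked numerically. The sketch below is only an illustration, under the assumption that the weight attached to parts of size $r$ is $(-1)^{r-1}\binom{\alpha}{r}$; it compares this with the Gamma-function form $\alpha\Gamma(r-\alpha)/(r!\,\Gamma(1-\alpha))$ and sums the weights over $r$, which should approach 1 (the total count is the sum of the counts by size). The function names are hypothetical.

```python
import math

def karlin_weight(alpha: float, r: int) -> float:
    """(-1)^(r-1) * binom(alpha, r), using the generalized binomial coefficient."""
    coeff = 1.0
    for i in range(r):
        coeff *= (alpha - i) / (i + 1)
    return (-1) ** (r - 1) * coeff

def karlin_weight_gamma(alpha: float, r: int) -> float:
    """Equivalent form alpha * Gamma(r - alpha) / (r! * Gamma(1 - alpha))."""
    return alpha * math.gamma(r - alpha) / (math.factorial(r) * math.gamma(1 - alpha))

alpha = 0.5
# Partial sum over part sizes r = 1..2000; the tail decays like r^(-alpha),
# so the sum approaches 1 slowly from below.
weights = [karlin_weight(alpha, r) for r in range(1, 2001)]
partial_sum = sum(weights)
```

The slow convergence of the partial sum reflects the heavy tail of the weights in $r$.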


Asymptotics in the drift case. We sketch the extension to the case of a subordinator
with positive drift $d$. In this case the length of composition satisfies
$K_n \sim K_{n,1} \sim n\lambda(\tilde R)$, where $\lambda$ denotes the Lebesgue measure, thus, the asymptotics
follow from the next lemma, which generalizes [16], Corollary 5.8.

LEMMA 4.2. Let $(S_t)$ be a subordinator with drift $d > 0$, and $\phi : [0, \infty] \to [0, 1]$
be an absolutely continuous strictly increasing function. Then

$$\lambda(\tilde R) = d \int_0^\infty \phi'(S_t)\, dt.$$

PROOF. Because $(S_t)$ is almost everywhere differentiable with derivative $d$,
by the change of variable $S_t = y$,

$$d \int_0^\infty \phi'(S_t)\, dt = \int_R \phi'(y)\, dy = \lambda(\phi(R)). \qquad \Box$$


In the drift case it is sensible to distinguish between genuine singleton parts,
which are caused when a sample point hits $\tilde R$, and the other occasional singletons
induced by open gaps comprising $\tilde R^c$. (For fixed $n$, the composition $C_n$ does not
allow one to make this distinction.) Denoting $K_{n,1-}$ and $K_{n,1+}$ the counts of genuine
and occasional singletons, we have $K_n = K_{n,1-} + K_{n,1+} + \sum_{r \ge 2} K_{n,r}$, and
$K_n \sim K_{n,1-}$. The asymptotic behavior of the variables $K_{n,1+}$ and $K_{n,r}$ for $r > 1$ is
then still as in Theorem 2.1, as follows by noting that the gap-counting Theorems
3.1 and 3.3 and Corollary 3.2 are also valid for subordinators with drift.

A curious phenomenon occurs for $\alpha = 1$: normalizing the Lévy data $(\nu, d)$ so
that $d = 1$, we have all three variables $K_n/n$, $K_{n,1-}/n$ and $K_{n,1+}/(n\ell^*(n))$
approaching the same limit.


5. Regenerative compositions. We shall specialize the results of the previous
section to the regenerative compositions appearing when $\tilde R$ is the range of
the multiplicative subordinator $\tilde S_t = 1 - \exp(-S_t)$. In this case there is a simple
connection between $\tilde S_t$ and $L(t)$ and a nice formula for the moments of $L$.




With each multiplicative subordinator, we associate the area process

$$A(t) = \int_0^t (1 - \tilde S_u)\, du$$

and its terminal value $A = A(\infty)$ obtained by taking the infinite integration bound.
In terms of the (additive) subordinator,

$$A = \int_0^\infty \exp(-S_u)\, du = \int_0^\infty (1 - \tilde S_u)\, du$$

is the widely studied exponential functional, see [4, 5, 7, 28, 29].
Let $\tilde\nu$ be the measure on $[0, 1]$ obtained by the exponential transform
$\phi(y) = 1 - e^{-y}$ from the measure $\nu$. The Laplace exponent is thus given by

$$\Phi(s) = \int_0^\infty (1 - e^{-sy})\, \nu(dy) = \int_0^1 (1 - (1 - x)^s)\, \tilde\nu(dx).$$

Because $\phi'(0) = 1$, the assumption (1) implies that the tail of the transformed
measure satisfies $\tilde\nu[x, 1] \sim x^{-\alpha}\ell(1/x)$ for $x \downarrow 0$. For arbitrary $\alpha > 0$, the process
$\tilde S^{(\alpha)}_t := 1 - (1 - \tilde S_t)^\alpha$ is itself a multiplicative subordinator with Lévy measure $\tilde\nu^{(\alpha)}$
related to $\tilde\nu$ by $\tilde\nu^{(\alpha)}[x, 1] = \tilde\nu[1 - (1 - x)^{1/\alpha}, 1]$. The relation between the
corresponding Laplace exponents is $\Phi^{(\alpha)}(s) = \Phi(\alpha s)$. That is to say,

$$\tilde S^{(\alpha)}_t = 1 - \exp(-\alpha S_t)$$

is the multiplicative counterpart of the scaled subordinator $(\alpha S_t)$. The area process
for $(\tilde S^{(\alpha)}_t)$ is

$$L^{(\alpha)}(t) := \int_0^t (1 - \tilde S^{(\alpha)}_u)\, du = \int_0^t (1 - \tilde S_u)^\alpha\, du = \int_0^t \exp(-\alpha S_u)\, du$$

and we define $L^{(\alpha)} = L^{(\alpha)}(\infty)$ to be the $A$-functional for $(\tilde S^{(\alpha)}_t)$.
For the scaling function $\psi(x) = x^{-\alpha}\ell(1/x)$, we have the following:


THEOREM 5.1. Suppose the Lévy measure fulfills (1). Then the jump
counts of the multiplicative subordinator $\tilde S_t = 1 - \exp(-S_t)$ satisfy

$$\tilde N_x(t)/\psi(x) \to L^{(\alpha)}(t), \qquad \tilde N_x/\psi(x) \to L^{(\alpha)},$$

for $x \downarrow 0$ almost surely and in the mean.

PROOF. We have $\phi'(y) = e^{-y}$. Then part (iv) of Theorem 3.3 applies, because
the required integrability holds for any positive power and the second
condition is fulfilled with $q(x) = x$. □


The application of Theorem 2.1 results in the following: 




COROLLARY 5.2. If the Lévy measure fulfills (1), then, for $0 < \alpha < 1$,
the composition induced by the multiplicatively regenerative set $\tilde R$ satisfies, for
$n \to \infty$,

$$K_n/(n^\alpha \ell(n)) \to \Gamma(1 - \alpha)\, L^{(\alpha)}$$

almost surely and in the mean. And for $\alpha = 1$,

$$K_n/(n\ell^*(n)) \to L^{(1)} = A$$

almost surely and in the mean, with $\ell^*$ as in (3).

Generalizations to $K_n(t)$ and $K_{n,r}(t)$ follow in the same way.

The distribution of $L^{(\alpha)}$ admits some exponential moments, hence, it is uniquely
determined by its integer moments. They are given by the following formula, which
was recorded for general Lévy processes in [7], though it can be traced back in
special cases to much earlier literature (see, e.g., [10], page 283):

(16) $$E\bigl(L^{(\alpha)}\bigr)^k = \frac{k!}{\prod_{j=1}^k \Phi(\alpha j)}.$$


EXAMPLE. Consider the two-parameter family of regenerative composition 
structures, as in [16], with 


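$$\tilde\nu[x, 1] = x^{-\alpha}(1 - x)^{\theta}, \qquad 0 < \alpha < 1,\ \theta > 0.$$
In this case we have
$$\Phi(s) = \frac{s\, \Gamma(1 - \alpha)\, \Gamma(s + \theta)}{\Gamma(s + 1 - \alpha + \theta)},$$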
thus, 
T'(0 -- 1)(a --0)(2a --0)--- ((k — Do 4-0) 
T (ka + 0)o 


in agreement with formula (192) from [25]. This specializes for 0 = 0 to the integer 
moments of the Mittag—Leffler distribution with parameter o. 


E(L™)* = 
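A product formula of the form $E(L^{(\alpha)})^k = k!/\prod_{j=1}^k \Phi(\alpha j)$ is easy to evaluate numerically. The sketch below is an illustration only and rests on two assumptions that are not taken verbatim from the text: the Laplace exponent for $\theta = 0$ is taken as $\Phi(s) = \Gamma(1-\alpha)\Gamma(s+1)/\Gamma(s+1-\alpha)$ (the exponent obtained from the tail $x^{-\alpha}$), and the comparison with the Mittag-Leffler moments $k!/\Gamma(k\alpha+1)$ is made for the rescaled variable $\Gamma(1-\alpha)L^{(\alpha)}$. The helper names are hypothetical.

```python
import math

def moments(phi, alpha: float, kmax: int):
    """Return [E L^0, ..., E L^kmax] via E L^k = k! / prod_{j=1}^k phi(alpha * j)."""
    out = [1.0]
    for k in range(1, kmax + 1):
        out.append(out[-1] * k / phi(alpha * k))
    return out

def phi_theta0(alpha: float):
    # Assumed Laplace exponent for theta = 0 (illustrative normalization).
    return lambda s: math.gamma(1 - alpha) * math.gamma(s + 1) / math.gamma(s + 1 - alpha)

alpha = 0.5
m = moments(phi_theta0(alpha), alpha, 6)
# Moments of Gamma(1 - alpha) * L, compared with the Mittag-Leffler moments.
scaled = [math.gamma(1 - alpha) ** k * m[k] for k in range(7)]
mittag_leffler = [math.factorial(k) / math.gamma(k * alpha + 1) for k in range(7)]

# A second illustrative check: Phi(s) = s/(s+1) with alpha = 1 gives (k+1)!.
uniform_moments = moments(lambda s: s / (s + 1.0), 1.0, 4)
```

With this normalization the two moment sequences agree up to rounding; a different normalization of the Lévy measure would shift the factor $\Gamma(1-\alpha)^k$ accordingly.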


Now we shall give a new proof of (16) in the case of subordinators. The method
we use here is not applicable to the exponential functionals of more general Lévy
processes, as considered in [5, 7], but it is much more elementary and apparently
more natural in the context of multiplicatively regenerative sets. By the above
discussion, it is sufficient to prove the formula for $\alpha = 1$, that is, for the area functional
of a multiplicative subordinator. Letting $m_k = EA^k$, we wish to show that

(17) $$m_k = \frac{k!}{\prod_{j=1}^k \Phi(j)}.$$




Finite Lévy measure, no drift. Suppose first that $d = 0$ and $\tilde\nu$ is a probability
measure. Let $X_j$ be a sample from $\tilde\nu$. Then $(\tilde S_t)$ is a step function whose range is a
stick-breaking sequence $X_j(1 - X_1) \cdots (1 - X_{j-1})$, $j = 1, 2, \ldots$, complemented
by 0 and 1. The jumps of $(\tilde S_t)$ occur at the epochs of an independent homogeneous
Poisson process. Therefore, $A$ is representable as a random series

$$A = E_1 X_1 + (E_1 + E_2) X_2 (1 - X_1) + (E_1 + E_2 + E_3) X_3 (1 - X_1)(1 - X_2) + \cdots,$$

where the $E_j$ are jointly independent exponential random variables with mean 1, also
independent from the $X_j$'s. The series is finite only if $\tilde\nu\{1\} > 0$. We can also
rearrange terms and write $A$ in the form

$$A = E_1 + E_2(1 - X_1) + E_3(1 - X_1)(1 - X_2) + \cdots,$$

from which we deduce

(18) $$A \stackrel{d}{=} E + (1 - X)A',$$

where $A'$ is a replica of $A$, independent of $(E, X) \stackrel{d}{=} (E_1, X_1)$. By virtue of the
formula

$$E(1 - X)^k = 1 - \Phi(k),$$

the expectation is computed as $EA = 1/\Phi(1)$, in accord with the $k = 1$ case
of (17). Furthermore, the identity

$$A^k \stackrel{d}{=} \sum_{j=0}^k \binom{k}{j} E^{k-j} (1 - X)^j (A')^j$$

implies that the moments $(m_k)$ of $A$ exist and satisfy the recursion

$$m_k \Phi(k) = \sum_{j=0}^{k-1} \frac{k!}{j!} (1 - \Phi(j))\, m_j.$$

To solve the recursion, split out the last term and substitute the same identity but
with $k - 1$ to arrive at

$$m_k \Phi(k) = k(1 - \Phi(k - 1))\, m_{k-1} + k\, m_{k-1} \Phi(k - 1),$$

which is the same as the simple multiplicative recursion

$$m_k \Phi(k) = k\, m_{k-1},$$

whose unique solution with the initial value $m_0 = 1$ is that given by (17).
Replacing the probability measure $\tilde\nu$ by its positive multiple, say, $\tilde\nu_c = c\tilde\nu$, we
can write a series representation for the corresponding functional $A_c$ exactly as
above, but with $E_j/c$ instead of $E_j$. Since $E(E/c)^k = k!/c^k$, the same computation
yields the additional factor $c^k$ in the denominator. This agrees with (17) because
the new Laplace exponent is the multiple $c\Phi$.
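The reduction of the full moment recursion to the multiplicative one can be verified in exact arithmetic. The following sketch is only an illustration: it assumes the recursion in the form $m_k\Phi(k) = \sum_{j<k} (k!/j!)(1-\Phi(j))\,m_j$ with $m_0 = 1$, and picks the uniform distribution on $[0,1]$ for the stick-breaking factor, so that $\Phi(k) = k/(k+1)$. Both the choice of distribution and the function names are assumptions made for the example.

```python
from fractions import Fraction
from math import factorial

def phi_uniform(k: int) -> Fraction:
    # Phi(k) = 1 - E(1 - X)^k = k/(k+1) when X is uniform on [0, 1].
    return Fraction(k, k + 1)

def moments_by_recursion(phi, kmax: int):
    """Solve m_k * phi(k) = sum_{j<k} (k!/j!) * (1 - phi(j)) * m_j, with m_0 = 1."""
    m = [Fraction(1)]
    for k in range(1, kmax + 1):
        rhs = sum(Fraction(factorial(k), factorial(j)) * (1 - phi(j)) * m[j]
                  for j in range(k))
        m.append(rhs / phi(k))
    return m

def moments_closed_form(phi, kmax: int):
    """m_k = k! / prod_{j=1}^k phi(j), built from m_k = k * m_{k-1} / phi(k)."""
    m = [Fraction(1)]
    for k in range(1, kmax + 1):
        m.append(m[-1] * k / phi(k))
    return m
```

For the uniform choice both routines return $m_k = (k+1)!$, so the two recursions agree term by term.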




Finite Lévy measure, positive drift. The moments formula for a subordinator
with drift can be proved analytically using approximation by drift-free
subordinators, but it is more instructive to inquire into this case separately.
Assuming $\tilde\nu[0, 1] = 1$ and $d > 0$, the subordinator coincides with the function $1 - e^{-dt}$
for $t < E_1$, and the jump at time $E_1$ is $e^{-dE_1} X_1$. The first-jump decomposition is

$$A \stackrel{d}{=} e^{-dE}(1 - X)A' + \frac{1 - e^{-dE}}{d},$$

with the same convention as in (18). The recursion for moments is obtained by
using

$$E(1 - X)^k = 1 - \Phi(k) + kd$$

and exploiting the fact that $e^{-dE}$ is distributed according to Beta$(1/d, 1)$; we obtain

$$m_k = \sum_{j=0}^k \frac{k!}{j!}\, \frac{(1 - \Phi(j) + jd)\, m_j}{d^{\,k-j+1}}\, \frac{\Gamma(j + 1/d)}{\Gamma(k + 1 + 1/d)}.$$

The solution is again (17), as justified by the same inductive argument. Repeating
the above scaling argument, we see that the formula also holds in the case $d > 0$
and arbitrary finite $\tilde\nu$.


General subordinator. Given arbitrary Lévy data $(d, \tilde\nu)$, consider the family
of subordinators with parameters $(d, \tilde\nu_\varepsilon)$, where $\tilde\nu_\varepsilon$ is a truncated measure that
coincides with $\tilde\nu$ outside $[0, \varepsilon]$ and is zero within $[0, \varepsilon]$. Using a version of the
well-known recipe, we can construct the corresponding multiplicative subordinators
$\tilde S$ and $\tilde S_\varepsilon$ using the same Poisson point process in the strip $[0, \infty[ \times [0, 1]$
with intensity measure Lebesgue $\times\ \tilde\nu$, so that

$$\tilde S_t = 1 - e^{-dt} \prod_{\tau_j \le t} (1 - \Delta_j),$$

where the product is over the atoms $(\tau_j, \Delta_j)$, and $\tilde S_\varepsilon$ has a similar representation,
with the only distinction that the factors corresponding to the atoms with $\Delta_j < \varepsilon$ do
not enter into the product. By construction, $\tilde S_\varepsilon \uparrow \tilde S$ as $\varepsilon \downarrow 0$ and the convergence
is uniform in $t$. Thus, by monotone convergence, we have for the corresponding
integrals $A_\varepsilon(t_1, t_2) \downarrow A(t_1, t_2)$ almost surely and with all moments, for all
$0 \le t_1 < t_2 \le \infty$. But all measures $\tilde\nu_\varepsilon$ are finite, thus, as we have shown, the moments
formula is true for them; therefore, the formula also holds for $(d, \tilde\nu)$, since the corresponding
Laplace exponents satisfy $\Phi_\varepsilon \uparrow \Phi$.


REMARK. For $A'$ a copy of $A$ independent of $(\tilde S_t, A(t))$ (fixed $t$), we have

$$A(t, \infty) \stackrel{d}{=} (1 - \tilde S_t)A', \qquad A \stackrel{d}{=} A(t) + (1 - \tilde S_t)A'.$$

This leads to

$$E(A(t, \infty))^k = \frac{k!}{\prod_{j=1}^k \Phi(j)} \exp(-t\Phi(k)), \qquad EA(t) = \frac{1 - \exp(-t\Phi(1))}{\Phi(1)}.$$

Higher moments of $A(t)$ are not immediate because of dependence between $\tilde S_t$
and $A(t)$. See [7] for the formulas with $t$ replaced by an independent exponential
variable.
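The first-moment formula $EA(t) = (1 - e^{-t\Phi(1)})/\Phi(1)$ for the truncated area can be illustrated by simulation. The sketch below is a seeded Monte Carlo check under assumptions chosen for the example only: no drift, unit-rate jump epochs, and uniform stick-breaking factors, so that $\Phi(1) = 1/2$. The function names are hypothetical.

```python
import math
import random

def sample_area(t: float, rng: random.Random) -> float:
    """One draw of A(t) = int_0^t (1 - S~_u) du for uniform stick-breaking, no drift."""
    remaining = 1.0   # current value of 1 - S~_u
    clock = 0.0
    area = 0.0
    while True:
        wait = rng.expovariate(1.0)          # gap to the next unit-rate Poisson epoch
        if clock + wait >= t:
            return area + (t - clock) * remaining
        area += wait * remaining
        clock += wait
        remaining *= 1.0 - rng.random()      # break off a uniform fraction of what is left

def mean_area(t: float, n: int, seed: int = 0) -> float:
    rng = random.Random(seed)
    return sum(sample_area(t, rng) for _ in range(n)) / n

def expected_area(t: float, phi1: float) -> float:
    return (1.0 - math.exp(-t * phi1)) / phi1
```

Since $A(1)$ lies in $[0, 1]$, a sample of 20000 draws has standard error below 0.004, so the empirical mean should sit close to `expected_area(1.0, 0.5)`.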


6. Poissonized compositions. A closely related type of structure appears
when the uniform sample of fixed size $n$ is replaced by a Poisson point process of
rate $\rho$. A composition of random weight $n(\rho)$ appears by separating the Poisson
points into blocks by means of a random closed set $\tilde R \subset [0, 1]$. We shall denote
this poissonized composition $\tilde C_\rho$ and provide with “~” all quantities related to it.
The relation to the fixed-$n$ composition is therefore $C_n \stackrel{d}{=} \tilde C_\rho$, conditionally, given
$n(\rho) = n$.

Poissonization is useful for two reasons. Generally speaking, it is a powerful
technique for asymptotic considerations in combinatorial problems, allowing one
to explicitly exploit the spatial independence where otherwise only a kind of
asymptotic independence is available (see, e.g., [27] for overview). On the other
hand, poissonization yields a family of compositions $(\tilde C_\rho, \rho > 0)$ which satisfies
a consistency condition analogous to the defining property of partition or
composition structures [11, 16, 25]. Explicitly, for any $\rho > 0$ and $x \in [0, 1]$, probability
distributions of the following compositions coincide: (i) a composition with rate
parameter $\rho(1 - x)$ and (ii) a thinned composition which appears when the atoms
making up a sample with rate $\rho$ are deleted independently with probability $x$.

In the sequel we shall only consider the case when $\tilde R$ is the range of a
multiplicative subordinator $\tilde S_t = 1 - \exp(-S_t)$, in which case the distribution of (i) also
coincides with the distribution of (iii), a tail composition of the composition of
rate $\rho$ which appears to the right of $x$, conditionally, given $x \in \tilde R$. The last
equivalence is analogous to the regenerative property of the fixed-$n$ compositions [16].
The same composition of random integer $n(\rho)$ appears when the range of the additive
subordinator $(S_t)$ is used to separate into blocks the atoms of an inhomogeneous
Poisson process on $[0, \infty]$ with exponential intensity measure $\rho e^{-x}\, dx$, $x \in [0, \infty]$.
We denote by $\tilde K_\rho$ the length of the poissonized composition, and $\tilde K_\rho(t)$, $\tilde K_{\rho,r}(t)$
stand for the number of parts of the partial composition produced by the range of the
multiplicative subordinator up to time $t$.

We proceed with the convergence results which recover and complement the
results in the previous sections. The equivalence of strong laws for $(C_n)$ and $(\tilde C_\rho)$
is quite obvious, and for quantities like moments, there is a well-developed analytical
technique of poissonization/depoissonization [19], though we shall use more
elementary arguments.




6.1. Recursions. Let $\mathcal F_t$ be the $\sigma$-algebra generated by the subordinator
$(S_u, u \in [0, t])$ and by the Poisson configuration on $[0, \tilde S_t]$. By the independence
property of the Poisson process and the regenerative property of $\tilde R$, the tail
composition induced by $\tilde R\, \cap\, ]\tilde S_t, 1]$ is independent of $\mathcal F_t$. This observation is a source
of recursions related to the poissonized composition.

Let $\tilde p_j(\rho)$, $j = 0, 1, \ldots$, be the distribution of $\tilde K_\rho$. Each $\tilde p_j$ may be extended
to an entire function of the complex variable, with initial value $\tilde p_j(0) = 0$ [with
the only exception $\tilde p_0(\rho) = e^{-\rho}$]. Introduce the factorial moments
$\tilde f^{(m)}(\rho) = E\tilde K_\rho(\tilde K_\rho - 1) \cdots (\tilde K_\rho - m + 1)$, with $m = 0, 1, \ldots$, $\tilde f^{(0)}(\rho) = 1$.


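LEMMA 6.1. The following integral recursions hold. For $j = 1, 2, \ldots$,

(19) $$\int_0^1 \bigl(\tilde p_j(\rho) - e^{-\rho x}\, \tilde p_j(\rho(1 - x))\bigr)\, \tilde\nu(dx) = \int_0^1 (1 - e^{-\rho x})\, \tilde p_{j-1}(\rho(1 - x))\, \tilde\nu(dx),$$

and for $m = 1, 2, \ldots$,

(20) $$\int_0^1 \bigl(\tilde f^{(m)}(\rho) - \tilde f^{(m)}(\rho(1 - x))\bigr)\, \tilde\nu(dx) = m \int_0^1 (1 - e^{-\rho x})\, \tilde f^{(m-1)}(\rho(1 - x))\, \tilde\nu(dx).$$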


PROOF. Each $\tilde p_j(\rho)$ may be written as a generating function whose coefficients
are rational functions in the variables $\Phi(n)$, $n > 0$; for example, the probability
of a one-part composition is

$$\tilde p_1(\rho) = e^{-\rho} \sum_{n=1}^\infty \frac{\rho^n}{n!}\, \frac{\Phi(n : n)}{\Phi(n)},$$

where

$$\Phi(n : m) = \binom{n}{m} \int_0^1 x^m (1 - x)^{n-m}\, \tilde\nu(dx), \qquad 1 \le m \le n.$$

Thus, the statement is of purely algebraic nature and can be translated as a series of
polynomial identities in these variables. Thus, it is sufficient to consider the
“stick-breaking case” of finite Lévy measure, normalized to a probability measure, when
the recursion is proved by conditioning on the first break $X = x$ with distribution $\tilde\nu$.
Indeed, there are $j$ blocks when either $[0, x]$ contains a Poisson atom and then
$[x, 1]$ generates $j - 1$ blocks [with probability $\tilde p_{j-1}(\rho(1 - x))$], or $[0, x]$ is empty
and $[x, 1]$ generates $j$ blocks. This gives

$$\tilde p_j(\rho) = \int_0^1 \bigl[e^{-\rho x}\, \tilde p_j(\rho(1 - x)) + (1 - e^{-\rho x})\, \tilde p_{j-1}(\rho(1 - x))\bigr]\, \tilde\nu(dx).$$




To keep this formula right for arbitrary finite $\tilde\nu$, we should put $\tilde p_j(\rho)$ into the
integral; then the formula becomes homogeneous in the $\Phi(n)$'s and holds in general,
for algebraic reasons.

To prove (20), start with the definition

$$\tilde f^{(m)}(\rho) = \sum_{j=0}^\infty \tilde p_j(\rho)\, j(j - 1) \cdots (j - m + 1),$$

then multiply both parts in (19) by $j(j - 1) \cdots (j - m + 1)$ and sum over $j$ using
the identity

$$j(j - 1) \cdots (j - m + 1) = (j - 1) \cdots (j - m) + m(j - 1) \cdots (j - m + 1). \qquad \Box$$


Manipulation with power series allows, in principle, computing the distribution
of $K_n$ with all moments, for fixed-$n$ compositions. Let us demonstrate this on the
expectation $\tilde f(\rho) = E\tilde K_\rho$. For $f_n = EK_n$, we have the poissonization identity

$$\tilde f(\rho) = e^{-\rho} \sum_{n=0}^\infty \frac{\rho^n}{n!}\, f_n.$$

Substituting into (20) and integrating, we obtain a relation between generating
functions

$$e^{-\rho} \sum_{n=1}^\infty \frac{\rho^n}{n!}\, f_n = \sum_{n=1}^\infty (-1)^{n+1} \frac{\rho^n}{n!}\, \frac{\Phi(n : n)}{\Phi(n)}.$$

Multiplying by $e^{\rho}$ and extracting the coefficients, we get

$$f_n = \sum_{j=1}^n (-1)^{j+1} \binom{n}{j} \frac{\Phi(j : j)}{\Phi(j)},$$

which is a familiar expression for $EK_n$, see [16].
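The alternating-sum expression for the expected number of parts lends itself to an exact check. The sketch below is an illustration that reads the formula as $f_n = \sum_{j=1}^n (-1)^{j+1}\binom{n}{j}\,\Phi(j:j)/\Phi(j)$ and, as an assumption for the example, takes the uniform stick-breaking measure on $[0,1]$, for which $\Phi(j:j) = 1/(j+1)$ and $\Phi(j) = j/(j+1)$, so the ratio is $1/j$. In that case the sum should reduce to the harmonic number $H_n$. The second helper evaluates the poissonization identity $e^{-\rho}\sum_n \rho^n f_n/n!$ by truncation. All names are hypothetical.

```python
from fractions import Fraction
from math import comb, exp, factorial

def expected_parts(n: int, ratio) -> Fraction:
    """f_n = sum_{j=1}^n (-1)^(j+1) * C(n, j) * ratio(j), ratio(j) = Phi(j:j)/Phi(j)."""
    return sum((-1) ** (j + 1) * comb(n, j) * ratio(j) for j in range(1, n + 1))

def ratio_uniform(j: int) -> Fraction:
    # Uniform case: Phi(j:j) = 1/(j+1) and Phi(j) = j/(j+1), so the ratio is 1/j.
    return Fraction(1, j)

def harmonic(n: int) -> Fraction:
    return sum((Fraction(1, i) for i in range(1, n + 1)), Fraction(0))

def poissonized(rho: float, f, terms: int = 80) -> float:
    """Truncated value of e^{-rho} * sum_n rho^n / n! * f(n)."""
    return exp(-rho) * sum(rho ** n / factorial(n) * float(f(n)) for n in range(terms))
```

In the uniform case the poissonized expectation should also match the series $\sum_{n\ge1} (-1)^{n+1}\rho^n/(n\cdot n!)$, obtained by reading off the coefficients $1/n$ directly.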


6.2. Asymptotics. The convergence $\tilde K_\rho/(\Gamma(1 - \alpha)\rho^\alpha \ell(\rho)) \to L^{(\alpha)}$ a.s. follows
exactly as in Sections 4 and 5 for $0 < \alpha < 1$ (and with a proper scaling, also
for $\alpha = 1$), in the footprints of Karlin [20], where the Poisson model was treated in
parallel with the fixed-$n$ case. In this section we show that the recursion (20)
implies the moments formulae analogous to (16). This can be regarded as a proof that
the convergence holds together with all moments, and also as yet another derivation
of the moments formula (16).

Introducing the poissonized Laplace exponent (not to be confused with $\Phi$ written
in terms of $\tilde\nu$)

$$\tilde\Phi(\rho) = \int_0^1 (1 - e^{-\rho x})\, \tilde\nu(dx),$$




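we have as $\rho \to \infty$

$$\tilde\Phi(\rho) \sim \Phi(\rho) \sim \begin{cases} \Gamma(1 - \alpha)\, \rho^\alpha \ell(\rho), & \text{for } 0 < \alpha < 1, \\ \rho\, \ell^*(\rho), & \text{for } \alpha = 1, \end{cases}$$

(see the Appendix).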


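PROPOSITION 6.2. The factorial moments satisfy

$$\tilde f^{(m)}(\rho) \sim c^{(m)} (\rho^\alpha \ell(\rho))^m, \qquad \rho \to \infty,$$

with $c^{(m)}$ given by

(21) $$c^{(m)} = m! \prod_{j=1}^m \frac{\Gamma(1 - \alpha)}{\Phi(\alpha j)}$$

for $0 < \alpha < 1$. For $\alpha = 1$, the factor $\Gamma(1 - \alpha)$ should be omitted and $\ell$ replaced
by $\ell^*$.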

PROOF. We concentrate on the case $0 < \alpha < 1$, leaving the case $\alpha = 1$ to the
reader. Trivially, $\tilde f^{(0)}(\rho) = 1$. Suppose by induction that the asymptotics holds for
some $m - 1$. Then setting $b = \Gamma(1 - \alpha)$ and $g(\rho) = (\rho^\alpha \ell(\rho))^m$, we have

(22) $$\int_0^1 (1 - e^{-\rho x})\, \tilde f^{(m-1)}(\rho(1 - x))\, \tilde\nu(dx) \sim b\, c^{(m-1)} g(\rho).$$

This is shown by splitting the integral at $\varepsilon$ and replacing the integrand for $x \in\, ]0, \varepsilon]$
by its asymptotics. To justify the induction step, fix $\varepsilon$ and suppose there exists
arbitrarily large $\rho$ such that

$$\tilde f^{(m)}(\rho) > (1 + \varepsilon)\, c^{(m)} g(\rho)$$

(we wish to lead this assumption to a contradiction). Then, perhaps selecting $\varepsilon$
smaller, for any fixed constant $C$, there exists arbitrarily large $\rho$ such that

$$\tilde f^{(m)}(\rho) > (1 + \varepsilon)\, c^{(m)} g(\rho) + C$$

[just because $\tilde f^{(m)}(\rho) \to \infty$]. Up to the end of this paragraph, $\rho = \rho(C)$ will be
the minimal $\rho$ for which the inequality holds. Note that $\rho(C) \to \infty$ as $C \to \infty$.
Thus, we have

$$\tilde f^{(m)}(\rho) = (1 + \varepsilon)\, c^{(m)} g(\rho) + C,$$
$$\tilde f^{(m)}(\rho(1 - x)) < (1 + \varepsilon)\, c^{(m)} g(\rho)(1 - x)^{\alpha m} + C, \qquad x \in\, ]0, 1],$$

and substituting this into (20), we see that the left-hand side is estimated from
below by

$$(1 + \varepsilon)\, c^{(m)} g(\rho) \int_0^1 (1 - (1 - x)^{\alpha m})\, \tilde\nu(dx) = (1 + \varepsilon)\, c^{(m)} g(\rho)\, \Phi(\alpha m) = (1 + \varepsilon)\, m\, c^{(m-1)} b\, g(\rho),$$

where $\rho \to \infty$ and we used monotonicity. This disagrees with the right-hand side
of (20) given by (22), giving the required contradiction. Thus,

$$\limsup_{\rho \to \infty} \frac{\tilde f^{(m)}(\rho)}{c^{(m)} g(\rho)} \le 1 + \varepsilon.$$

A symmetric argument shows that

$$\liminf_{\rho \to \infty} \frac{\tilde f^{(m)}(\rho)}{c^{(m)} g(\rho)} \ge 1 - \varepsilon.$$

Letting $\varepsilon \to 0$ ends the proof. □


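Depoissonization follows rather easily. Recall that the collection of atoms
of the Poisson process with rate $\rho$ can be identified with a uniform sample
$\{u_1, \ldots, u_{n(\rho)}\}$, with $n(\rho)$ distributed according to Poisson$(\rho)$. By the obvious
monotonicity of $K_n$, we have

$$K_n\, 1(n(\rho(1 - \varepsilon)) < n) \ge \tilde K_{\rho(1-\varepsilon)}\, 1(n(\rho(1 - \varepsilon)) < n),$$
$$K_n\, 1(n(\rho(1 + \varepsilon)) > n) \le \tilde K_{\rho(1+\varepsilon)}\, 1(n(\rho(1 + \varepsilon)) > n).$$

Selecting $n = \rho$ and letting $\rho \to \infty$, the elementary large deviation bounds for the
probability $P(\rho(1 - \varepsilon) < n(\rho) < \rho(1 + \varepsilon))$ imply that, for $n \to \infty$,

$$\tilde K_{n(1-\varepsilon)} \ll K_n \ll \tilde K_{n(1+\varepsilon)}.$$

Letting $\varepsilon \downarrow 0$ and using Proposition 6.2, we see that $K_n \sim \tilde K_n$ almost surely and
with all moments.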
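The concentration of the Poisson sample size that drives the depoissonization step can be made concrete. The sketch below, offered only as an illustration, sums the exact Poisson probabilities to evaluate $P(\rho(1-\varepsilon) < n(\rho) < \rho(1+\varepsilon))$ for a few rates; the function name and the particular values of $\rho$ and $\varepsilon$ are choices made for the example.

```python
import math

def poisson_window_prob(rho: float, eps: float) -> float:
    """P(rho*(1-eps) < N < rho*(1+eps)) for N ~ Poisson(rho), summed from the exact pmf."""
    lo = math.floor(rho * (1 - eps)) + 1    # smallest integer strictly above the lower end
    hi = math.ceil(rho * (1 + eps)) - 1     # largest integer strictly below the upper end
    total = 0.0
    for n in range(max(lo, 0), hi + 1):
        # log-pmf avoids overflow in rho**n / n! for large rho
        total += math.exp(-rho + n * math.log(rho) - math.lgamma(n + 1))
    return total

p10 = poisson_window_prob(10.0, 0.2)
p100 = poisson_window_prob(100.0, 0.2)
p1000 = poisson_window_prob(1000.0, 0.2)
```

For fixed $\varepsilon$ the window probability tends to 1 exponentially fast in $\rho$, which is the large deviation fact used in the argument.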

Observing that the computation of the constants (21) is equivalent to the for- 
mula (16) [thus, we have yet another proof for (16)], and recalling Theorem 2.1, 
we summarize the above discussion in the following theorem. 


THEOREM 6.3. The almost-sure convergence

$$\tilde K_\rho/(\rho^\alpha \ell(\rho)) \to \Gamma(1 - \alpha)\, L^{(\alpha)}, \qquad \rho \to \infty,$$
$$K_n/(n^\alpha \ell(n)) \to \Gamma(1 - \alpha)\, L^{(\alpha)}, \qquad n \to \infty,$$

holds for both Poisson and fixed-$n$ compositions together with the convergence of
all integer moments for $0 < \alpha < 1$, and with a proper scaling also for $\alpha = 1$.

The convergence of moments of $K_{n,r}$, $K_n(t)$, $K_{n,r}(t)$, $\tilde K_{\rho,r}$, $\tilde K_\rho(t)$, $\tilde K_{\rho,r}(t)$
(with obvious definition of the last two random variables) follows from the theorem
by dominated convergence.




6.3. A martingale approach. Extending the discussion in the previous section,
consider $\tilde K_\rho(t)$, which is the number of blocks of a poissonized composition
produced by the subordinator up to time $t$. We can view $(\tilde K_\rho(t), t \ge 0)$ as either
an increasing process with unit jumps or a point process of those jump-times
of $(\tilde S_t)$ which have jump intervals covering some sample points. The compensator
for $\tilde K_\rho(t)$ is

$$C_t = \int_0^t \tilde\Phi(\rho(1 - \tilde S_u))\, du.$$

By observing that $\tilde\Phi(\rho(1 - x))$ is the probability that a gap with leftpoint $x$ is hit
by a Poisson atom, the formula can be first argued in the renewal case. The general
case follows by extrapolation from the case of finite Lévy measure. This readily
implies the following:


LEMMA 6.4. For each $\rho > 0$, the process

$$M_t = \tilde K_\rho(t) - C_t, \qquad t \in [0, \infty],$$

is a square-integrable martingale with unit jumps and quadratic predictable
characteristics $\langle M \rangle_t = C_t$. Furthermore,

$$EM_t^2 = E(\tilde K_\rho(t) - C_t)^2 = EC_t.$$

PROOF. The squared jump magnitudes of $M_t$ are 1. This implies that the
submartingale $M_t^2$ has the same compensator as $\tilde K_\rho(t)$, that is, $\langle M \rangle_t = C_t$, as in [22],
Section 6.2. □


The lemma opens yet another approach to the convergence results, for which
we give below the $L^2$-version. Note that the scaling by $\tilde\Phi(\rho)$ is asymptotically
the same as the scaling by $\Phi(\rho)$ (see Appendix), which is asymptotic to
$\Gamma(1 - \alpha)\ell(\rho)\rho^\alpha$ for $0 < \alpha < 1$ and to $\ell^*(\rho)\rho$ for $\alpha = 1$.


THEOREM 6.5. Under the regular variation assumption (1), as $\rho \to \infty$,

(23) $$\frac{\tilde K_\rho}{\tilde\Phi(\rho)} \to L^{(\alpha)}$$

almost surely and in $L^2$. An analogous result is valid for $\tilde K_\rho(t)$ for each $t > 0$.


PROOF. We wish to establish the convergence (23) in £7. Use Lemma 6.4 to 
obtain 


Kp (*»9*0-$», e» $(o(1 — S) 
Dd ce M — M ——— dt. 
V) (SG J, $(p) it) = $0 s | $ 







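Also observe that

(25) $$\frac{\tilde\Phi(\rho(1 - \tilde S_t))}{\tilde\Phi(\rho)} \to (1 - \tilde S_t)^\alpha \qquad \text{as } \rho \to \infty$$

almost surely for each fixed $t$. Thus, (23) would follow by dominated convergence
once we could bound

$$\int_0^\infty \frac{\tilde\Phi(\rho(1 - \tilde S_t))}{\tilde\Phi(\rho)}\, dt$$

from above by a square-integrable random variable. To this end, write
$\tilde\Phi(\rho) = \rho^\alpha \ell_0(\rho)$ with slowly varying $\ell_0$ [so $\ell_0(\rho) \sim \ell(\rho)$]; then, if Potter's bound were
valid for $\ell_0$ on $]0, \infty[$, we could estimate

$$\tilde\Phi(\rho(1 - \tilde S_t))/\tilde\Phi(\rho) < C(1 - \tilde S_t)^{\alpha - \delta}$$

with some small $\delta > 0$ and $C > 0$, whence

$$\int_0^\infty \bigl(\tilde\Phi(\rho(1 - \tilde S_t))/\tilde\Phi(\rho)\bigr)\, dt < C \int_0^\infty (1 - \tilde S_t)^{\alpha - \delta}\, dt = C L^{(\alpha - \delta)}.$$

To make this argument precise, we fix some sufficiently large constant $c = X(C, \delta)$
required in Theorem A.3(i) and then split the integral at the first passage
time $\sigma_{1 - c/\rho}$ of $(\tilde S_t)$ over the level $1 - c/\rho$. The tail integral

$$\int_{\sigma_{1 - c/\rho}}^\infty \tilde\Phi(\rho(1 - \tilde S_t))\, dt$$

is bounded from above by a Poisson random variable with mean $c$, which is
obviously square-integrable, for each $c > 0$. Thus, it is sufficient to exploit Potter's
bound for $\ell_0$ on $[c, \infty[$. □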


APPENDIX 


A.1. Abel-Tauberian theorems. An exposition of Abel-Tauberian theorems
for the Laplace transform is given in [9], Section XIII.5. Bingham, Goldie and
Teugels ([6], Section 1.7 and Chapter 4) give a fuller account, also for more general
integral transforms, including the Mellin transform.

We establish next some elementary connections between the integral transforms
in a form suitable for applications to subordinators. Consider the two Laplace
exponents (also called Bernstein functions)

$$\Phi(s) = \int_0^\infty (1 - e^{-sy})\, \nu(dy) = \int_0^1 (1 - (1 - x)^s)\, \tilde\nu(dx)$$

and

$$\tilde\Phi(s) := \int_0^1 (1 - e^{-sx})\, \tilde\nu(dx) = \int_0^\infty \bigl(1 - e^{-s(1 - e^{-y})}\bigr)\, \nu(dy),$$



where the measure $\tilde\nu(dx)$ on $]0, 1]$ is the image of $\nu(dy)$ on $]0, \infty]$ via $x = 1 - e^{-y}$,
and it is assumed that $\Phi(s) < \infty$ for some (and, hence, all) $s > 0$. The function $\tilde\Phi$
is the poissonization of $\Phi$, that is,

$$\tilde\Phi(s) = e^{-s} \sum_{n=0}^\infty \frac{s^n}{n!}\, \Phi(n).$$


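LEMMA A.1. Whatever the Laplace exponent $\Phi$,

$$\lim_{s \to \infty} (\Phi(s) - \tilde\Phi(s)) = 0.$$

PROOF. With a hint from [8], page 1257, we have, for $0 < x < 1$ and $s > 0$,

$$0 \le e^{-sx} - (1 - x)^s \le s x^2 e^{-sx} \le e^{-1} x.$$

Using this we get

$$0 \le \Phi(s) - \tilde\Phi(s) = \int_0^1 \bigl(e^{-sx} - (1 - x)^s\bigr)\, \tilde\nu(dx) \le e^{-1} \int_0^1 x\, \tilde\nu(dx) = e^{-1} \Phi(1),$$

so the claim follows by dominated convergence. □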
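The comparison between a Laplace exponent and its poissonized version can be illustrated in closed form. The sketch below assumes, purely for the example, the uniform measure on $[0,1]$, for which $\Phi(s) = s/(s+1)$ and $\tilde\Phi(s) = 1 - (1 - e^{-s})/s$. It evaluates the gap $\Phi(s) - \tilde\Phi(s)$ on a few values of $s$, compares it with the bound $e^{-1}\Phi(1)$, and checks the poissonization series $e^{-s}\sum_n s^n\Phi(n)/n!$ against the closed form. The function names are hypothetical.

```python
import math

def phi(s: float) -> float:
    # Phi(s) = int_0^1 (1 - (1-x)^s) dx = s/(s+1) for the uniform measure (illustrative).
    return s / (s + 1.0)

def phi_tilde(s: float) -> float:
    # Phi~(s) = int_0^1 (1 - e^{-s x}) dx = 1 - (1 - e^{-s})/s.
    return 1.0 - (1.0 - math.exp(-s)) / s

def phi_tilde_series(s: float, terms: int = 200) -> float:
    """Poissonization e^{-s} * sum_n s^n/n! * Phi(n), with Poisson weights built iteratively."""
    total, weight = 0.0, math.exp(-s)
    for n in range(terms):
        total += weight * phi(n)
        weight *= s / (n + 1)
    return total

gaps = [phi(s) - phi_tilde(s) for s in (0.5, 1.0, 2.0, 5.0, 10.0, 100.0, 1000.0)]
```

In this example the gap stays below $e^{-1}\Phi(1) \approx 0.184$ and decays roughly like $1/s^2$ for large $s$.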


COROLLARY A.2. Abel-Tauberian relations (as $s \to \infty$) between the tail of
the measure $\nu[1/s, \infty]$ and the Laplace exponent are the same for $\Phi(s)$ and $\tilde\Phi(s)$.

Estimates of the difference can be given under the assumption of regular variation.
The Laplacian case of monotone density, relating asymptotics of $\tilde\nu$ and $\tilde\Phi$, is
covered by a combination of Theorems 3 and 4 in [9], Section 13.5.


A.2. Potter's theorem.

THEOREM A.3 ([6], Theorem 1.5.6). Let $\ell$ be a function of slow variation at
infinity.

(i) For arbitrarily chosen constants $A > 1$, $\delta > 0$, there exists $X = X(A, \delta)$
such that

$$\ell(y)/\ell(x) \le A \max\bigl((y/x)^{\delta}, (y/x)^{-\delta}\bigr) \qquad (x \ge X,\ y \ge X).$$

(ii) If $\ell$ is bounded away from $0$ and $\infty$ on every compact subset of $[0, \infty[$,
then, for every $\delta > 0$, there exists $A' = A'(\delta) > 1$ such that

$$\ell(y)/\ell(x) \le A' \max\bigl((y/x)^{\delta}, (y/x)^{-\delta}\bigr) \qquad (x > 0,\ y > 0).$$
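A Potter-type bound of the form $\ell(y)/\ell(x) \le A\max((y/x)^{\delta}, (y/x)^{-\delta})$ for $x, y \ge X$ can be probed numerically for a concrete slowly varying function. The sketch below is an illustration only: it takes $\ell = \log$ and checks the inequality on a finite grid, which is evidence rather than proof. The grid, the constants and the function name are choices made for the example.

```python
import math

def potter_holds(ell, A: float, delta: float, X: float, grid) -> bool:
    """Check ell(y)/ell(x) <= A * max((y/x)^delta, (y/x)^-delta) for grid points x, y >= X."""
    pts = [p for p in grid if p >= X]
    for x in pts:
        for y in pts:
            ratio = y / x
            bound = A * max(ratio ** delta, ratio ** (-delta))
            if ell(y) / ell(x) > bound:
                return False
    return True

# Geometric grid from e up to e * 10^10.
grid = [math.e * 10 ** (k / 4) for k in range(0, 41)]

ok_loose = potter_holds(math.log, 2.0, 0.5, math.e, grid)      # generous A and delta
ok_tight = potter_holds(math.log, 1.01, 0.01, math.e, grid)    # same threshold X, tighter constants
```

With $A = 2$ and $\delta = 1/2$ the threshold $X = e$ suffices for the logarithm, while the tighter constants fail at the same threshold; this is why the threshold in such a bound has to depend on $A$ and $\delta$.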


Acknowledgment. The authors are grateful to a referee for helpful comments. 




REFERENCES 


[1] ARRATIA, R., BARBOUR, A. D. and TAVARÉ, S. (2003). Logarithmic Combinatorial Struc- 
tures: A Probabilistic Approach. European Mathematical Society, Zürich. MR2032426 
[2] BARBOUR, A. D. and GNEDIN, A. V. (2006). Regenerative compositions in the case of slow 
variation. Stochastic Process Appl. To appear. 
[3] BERTOIN, J. (1999). Subordinators: Examples and applications. Lectures on Probability The- 
ory and Statistics (Saint-Flour, 1997). Lecture Notes in Math. 1717 1-91. Springer, 
Berlin. MR1746300 
[4] BERTOIN, J. and YOR, M. (2001). On subordinators, self-similar Markov processes and
some factorizations of the exponential variable. Electron. Comm. Probab. 6 95-106. 
MR1871698 
[5] BERTOIN, J. and YOR, M. (2002). On the entire moments of self-similar Markov process and 
exponential functionals of Lévy processes. Ann. Fac. Sci. Toulouse Math. (6) 11 33-45. 
MR1986381 
[6] BINGHAM, N. H., GOLDIE, C. M. and TEUGELS, J. L. (1987). Regular Variation. Cambridge 
Univ. Press. MR0898871 
[7] CARMONA, P., PETIT, F. and YOR, M. (1997). On the distribution and asymptotic results for
exponential functionals of Lévy processes. In Exponential Functionals and Principal Val- 
ues Related to Brownian Motion 73—130. Rev. Mat. Iberoamericana, Madrid. MR1648657 
[8] DUTKO, M. (1989). Central limit theorems for infinite urn models. Ann. Probab. 17
1255-1263. MR1009456 
[9] FELLER, W. (1971). An Introduction to Probability Theory and Its Applications II, 2nd ed.
Wiley, New York. MR0210154 
[10] FILIPPOV, A. F. (1961). On the distribution of the sizes of particles which undergo splitting. 
Theory Probab. Appl. 6 275—293. MR0140159 
[11] GNEDIN, A. V. (1997). The representation of composition structures. Ann. Probab. 25 
1437-1450. MR1457625 
[12] GNEDIN, A. V. (2004). Three sampling formulas. Combin. Probab. Comput. 13 185-193. 
MR2047235 
[13] GNEDIN, A. V. (2004). The Bernoulli sieve. Bernoulli 10 79-96. MR2044594 
[14] GNEDIN, A. V. and PITMAN, J. (2004). Regenerative partition structures. Electron. J. Combin. 
11 1-21. MR2120096 
[15] GNEDIN, A. V. and PITMAN, J. (2005). Self-similar and Markov composition structures. 
Zap. Nauchn. Sem. POMI 326 59-84. 
[16] GNEDIN, A. V. and PITMAN, J. (2005). Regenerative composition structures. Ann. Probab. 33 
445-479. MR2122798 
[17] GNEDIN, A. V., PITMAN, J. and YOR, M. (2006). Asymptotic laws for regenerative compo- 
sitions: Gamma subordinators and the like. Probab. Theory Related Fields. To appear. 
[18] HITCZENKO, P. and STENGLE, G. (2000). Expected number of distinct part sizes in a random 
integer composition. Combin. Probab. Comput. 9 519-527. MR1816100 
[19] JACQUET, P. and SZPANKOWSKI, W. (1998). Analytical depoissonization and its applications. 
Theoret. Comput. Sci. 201 1-62. MR1625392 
[20] KARLIN, S. (1967). Central limit theorems for certain infinite urn schemes. J. Math. Mech. 17 
373-401. MR0216548 
[21] KEROV, S. (1995). Small cycles of big permutations. Preprint PDMI. 
[22] LAST, G. and BRANDT, A. (1995). Marked Point Processes on the Real Line: The Dynamic 
Approach. Springer, New York. MR1353912 
[23] NERETIN, YU. A. (1996). The group of diffeomorphisms of a ray, and random Cantor sets. 
Mat. Sb. 187 73—84. [Translation in Sbornik Math. (1996) 187 857—868.] MR1407680 




[24] PITMAN, J. (2003). Poisson-Kingman partitions. In Science and Statistics: A Festschrift for 
Terry Speed (D. R. Goldstein ed.) 1-34. IMS, Hayward, CA. MR2004330 

[25] PITMAN, J. (2006). Combinatorial Stochastic Processes. École d'Été de Probabilités de Saint- 
Flour XXXII. Lecture Notes in Math. 1875. Springer, Berlin. 

[26] PITMAN, J. and YOR, M. (1997). The two-parameter Poisson-Dirichlet distribution derived 
from a stable subordinator. Ann. Probab. 25 855-900. MR1434129 

[27] STEELE, M. (1997). Probability Theory and Combinatorial Optimization. SIAM, Philadelphia. 
MR1422018 

[28] URBANIK, K. (1992). Functionals of transient stochastic processes with independent incre- 
ments. Studia Math. 103 299—315. MR1202015 

[29] URBANIK, K. (1995). Infinite divisibility of some functionals on stochastic processes. Probab. 
Math. Statist. 15 493-513. MR1369817 


A. GNEDIN
MATHEMATICAL INSTITUTE
UTRECHT UNIVERSITY
P.O. BOX 80010
3508 TA UTRECHT
THE NETHERLANDS
E-MAIL: gnedin@math.uu.nl

J. PITMAN
DEPARTMENT OF STATISTICS
UNIVERSITY OF CALIFORNIA
BERKELEY, CALIFORNIA 94720-3860
USA
E-MAIL: pitman@stat.berkeley.edu

M. YOR
LABORATORY OF PROBABILITY
UNIVERSITY OF PARIS VI
4 PLACE JUSSIEU, CASE 188
75252 PARIS, CEDEX 05
FRANCE
E-MAIL: deaproba@proba.jussieu.fr


The Annals of Probability 

2006, Vol. 34, No. 2, 493-527 

DOI: 10.1214/00911790500000710 

© Institute of Mathematical Statistics, 2006 


ON THE MAXIMUM QUEUE LENGTH IN THE 
SUPERMARKET MODEL 


BY MALWINA J. LUCZAK AND COLIN MCDIARMID 
London School of Economics and University of Oxford 


There are $n$ queues, each with a single server. Customers arrive in a 
Poisson process at rate $\lambda n$, where $0 < \lambda < 1$. Upon arrival each customer 
selects $d \ge 2$ servers uniformly at random, and joins the queue at a 
least-loaded server among those chosen. Service times are independent 
exponentially distributed random variables with mean 1. We show that the system 
is rapidly mixing, and then investigate the maximum length of a queue in 
the equilibrium distribution. We prove that with probability tending to 1 as 
$n \to \infty$ the maximum queue length takes at most two values, which are 
$\ln\ln n/\ln d + O(1)$. 


1. Introduction. We study a well-known queueing model with $n$ separate 
queues, each with a single server. Customers arrive into the system in a Poisson 
process at rate $\lambda n$, where $0 < \lambda < 1$ is a constant. Upon arrival each customer 
chooses $d$ queues uniformly at random with replacement, and joins a shortest 
queue amongst those chosen (where she breaks ties by choosing the first of the 
shortest queues in the list of $d$). Here $d$ is a fixed positive integer. Customers are 
served according to the first-come-first-served discipline. Service times are 
independent exponentially distributed random variables with mean 1. 
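
As an illustration of the dynamics just described, the following is a minimal simulation sketch and not part of the original analysis. The function name `simulate_supermarket` and its parameters are hypothetical. Departures are generated by a rate-$n$ Poisson clock that picks a uniformly random queue and is ignored when that queue is empty, which is one standard way to realize exponential mean-1 services.

```python
import random


def simulate_supermarket(n, lam, d, t_end, seed=0):
    """Simulate the n-queue model from the empty state up to time t_end.

    Returns the list of queue lengths at time t_end.
    """
    rng = random.Random(seed)
    queues = [0] * n
    t = 0.0
    # Arrivals occur at rate lam * n, potential departures at rate n.
    total_rate = lam * n + n
    while True:
        t += rng.expovariate(total_rate)
        if t > t_end:
            return queues
        if rng.random() < lam / (lam + 1.0):
            # Arrival: choose d queues with replacement and join the first
            # shortest queue in the list (min returns the first minimum).
            choices = [rng.randrange(n) for _ in range(d)]
            target = min(choices, key=lambda q: queues[q])
            queues[target] += 1
        else:
            # Potential departure: a uniformly selected queue loses a
            # customer if it is nonempty; otherwise nothing happens.
            q = rng.randrange(n)
            if queues[q] > 0:
                queues[q] -= 1
```

Calling `max(simulate_supermarket(n, lam, d, t_end))` for moderately large `t_end` gives a rough empirical view of the maximum queue length for a given $d$.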

A number of authors have studied this model before, as well as its extension 
to a Jackson network setting [2-4, 11, 13-15, 17]. For instance, it is shown in [2] 
that the system is chaotic, provided that it starts close to a suitable deterministic 
initial state, or is in equilibrium. This means that the paths of members of any fixed 
finite subset of queues are asymptotically independent of one another, uniformly 
on bounded time intervals. This result implies a law of large numbers for the time 
evolution of the proportion of queues of different lengths, that is, for the empirical 
measure on the path space [2]. In particular for each fixed positive integer $k_0$, as $n$ 
tends to infinity the proportion of queues with length at least $k_0$ converges weakly 
(when the infinite-dimensional state space is endowed with the product topology) 
to a function $v_t(k_0)$, where $v_t(0) = 1$ for all $t \ge 0$ and $(v_t(k) : k \in \mathbb{N})$ is the unique 
solution to the system of differential equations 

\[ \frac{dv_t(k)}{dt} = \lambda\bigl(v_t(k-1)^d - v_t(k)^d\bigr) - \bigl(v_t(k) - v_t(k+1)\bigr) \tag{1} \]





Received April 2004; revised March 2005. 

AMS 2000 subject classifications. Primary 60C05; secondary 68R05, 90B22, 60K25, 60K30, 
68M20. 

Key words and phrases. Supermarket model, join the shortest queue, random choices, power of 
two choices, maximum queue length, load balancing, equilibrium, concentration of measure. 




for $k \in \mathbb{N}$. Here we assume appropriate initial conditions $(v_0(k) : k \in \mathbb{N})$ such that 
$1 \ge v_0(1) \ge v_0(2) \ge \cdots \ge 0$. Further, again for a fixed positive integer $k_0$, as $n$ 
tends to infinity, in the equilibrium distribution this proportion converges in 
probability to $\lambda^{1+d+\cdots+d^{k_0-1}}$, and thus the probability that a given queue has length at 
least $k_0$ also converges to $\lambda^{1+d+\cdots+d^{k_0-1}}$. 
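
The equilibrium limit can be checked against the differential equations (1) numerically. The sketch below is an illustration under stated assumptions: the helper names `drift` and `equilibrium_tail` are hypothetical, and the system is truncated at a finite `k_max`. It evaluates the right-hand side of (1) at $v(k) = \lambda^{1+d+\cdots+d^{k-1}}$ and confirms that it vanishes up to rounding error.

```python
def equilibrium_tail(lam, d, k_max):
    """Return [v(0), ..., v(k_max)] with v(k) = lam ** (1 + d + ... + d**(k-1))."""
    return [lam ** sum(d ** i for i in range(k)) for k in range(k_max + 1)]


def drift(v, lam, d):
    """Right-hand side of (1) for k = 1, ..., len(v) - 2."""
    return [
        lam * (v[k - 1] ** d - v[k] ** d) - (v[k] - v[k + 1])
        for k in range(1, len(v) - 1)
    ]


if __name__ == "__main__":
    v = equilibrium_tail(0.7, 2, 6)
    # Each entry should be zero up to floating-point error.
    print(drift(v, 0.7, 2))
```

The cancellation rests on the identity $\lambda\, v(k-1)^d = v(k)$, which holds because $1 + d(1 + d + \cdots + d^{k-2}) = 1 + d + \cdots + d^{k-1}$.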

Although these results refer only to fixed queue length $k_0$ and bounded time 
intervals, they suggest that when $d \ge 2$, in equilibrium the maximum queue length 
may usually be $O(\ln\ln n)$. Our main contribution is to show that this is indeed the 
case, and to give precise results on the behavior of the maximum queue length. In 
particular, we shall see that when $d \ge 2$, with probability tending to 1 as $n \to \infty$, 
in the equilibrium distribution the maximum queue length takes at most two values; 
and these values are $\ln\ln n/\ln d + O(1)$. We show also that the system is 
rapidly mixing, that is, the distribution settles down quickly to the equilibrium 
distribution. Another natural question concerns fluctuations when in the equilibrium 
distribution: how long does it take to see large deviations of the maximum queue 
length from its stationary median? We provide an answer by establishing strong 
concentration estimates over time intervals of length polynomial in $n$. Our 
techniques are partly combinatorial, and are used also in [7-9]. In particular, in [8] 
we use the concentration estimates obtained here to establish quantitative results 
on the convergence of the distribution of a queue length and on the asymptotic 
independence of small subsets of queues, the "chaotic behavior" of the system. 

Recently, in [6, 10], a quantitative approximation has been obtained for the su- 
permarket model, including a law of large numbers and a central limit theorem. 
These results rely on properties of continuous-time exponential martingales and 
strong approximation of Poisson processes by Brownian motion. A “localization” 
technique yields tight bounds on the deviation probabilities uniformly in all co- 
ordinates of the infinite-dimensional state space, in a spirit somewhat akin to the 
approach adopted in this paper and in [7-9]. However, the results in [6, 10] concern 
solely fixed-length time intervals and do not extend to equilibrium behavior. 

Let us introduce some notation so that we can state our results. Consider the 
$n$-queue model. For each time $t \ge 0$ and each $j = 1,\ldots,n$ let $X_t^{(n)}(j)$ or $X_t(j)$ 
denote the number of customers in queue $j$, always including the customer currently 
being served if there is one. (We shall keep the superscript "$n$" in the 
notation in this section, but then drop it in later sections.) We make the usual 
assumption of right-continuity of the sample paths. Let $X_t^{(n)}$ or $X_t$ be the 
queue-lengths vector $(X_t^{(n)}(1),\ldots,X_t^{(n)}(n))$. For a given positive integer $n$, $(X_t^{(n)})$ is an 
ergodic continuous-time Markov chain. Thus there is a unique stationary 
distribution $\Pi^{(n)}$ or $\Pi$ for the vector of queue lengths; and, whatever the distribution of 
the starting state, the distribution of the queue-lengths vector $X_t^{(n)}$ at time $t$ 
converges to $\Pi^{(n)}$ as $t \to \infty$. We will show that, with reasonable initial conditions, the 
convergence is very fast. Note that the $L_1$-norm $\|X_t\|_1$ of $X_t$ is the total number 
of customers present at time $t$, and the $L_\infty$-norm $\|X_t\|_\infty$ is the maximum queue 
length. 

The probability law or distribution of a random variable $X$ will be denoted 
by $\mathcal{L}(X)$. The total variation distance between two probability distributions 
$\mu_1$ and $\mu_2$ may be defined by $d_{TV}(\mu_1, \mu_2) = \sup_A |\Pr(X \in A) - \Pr(Y \in A)|$, 
where the supremum is over all events $A$, or equivalently by $d_{TV}(\mu_1, \mu_2) = 
\inf \Pr(X \ne Y)$, where the infimum is over all couplings of $X$ and $Y$ where 
$\mathcal{L}(X) = \mu_1$ and $\mathcal{L}(Y) = \mu_2$. 
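
For distributions on a small finite set the supremum over events can be evaluated directly. The following sketch is illustrative only, with hypothetical function names. It compares the brute-force supremum with the half-$L_1$ formula $\frac12\sum_s |\mu_1(s) - \mu_2(s)|$, a standard equivalent expression for discrete distributions that is not stated in the text above.

```python
from itertools import combinations


def total_variation(p, q):
    """Half the L1 distance between two distributions given as dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in support)


def sup_over_events(p, q):
    """Brute-force supremum of |p(A) - q(A)| over all events A."""
    support = sorted(set(p) | set(q))
    best = 0.0
    for size in range(len(support) + 1):
        for event in combinations(support, size):
            gap = abs(sum(p.get(s, 0.0) for s in event)
                      - sum(q.get(s, 0.0) for s in event))
            best = max(best, gap)
    return best


if __name__ == "__main__":
    p = {0: 0.5, 1: 0.5}
    q = {0: 0.25, 1: 0.25, 2: 0.5}
    print(total_variation(p, q), sup_over_events(p, q))
```

The brute-force version enumerates all $2^{|\text{support}|}$ events, so it is only meant for very small examples.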

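For any given state $x$ we shall write $\mathcal{L}(X_t^{(n)}, x)$ to denote the law of $X_t^{(n)}$ given 
$X_0^{(n)} = x$. For $\varepsilon > 0$, the mixing time $\tau^{(n)}(\varepsilon, x)$ starting from state $x$ is defined by 

\[ \tau^{(n)}(\varepsilon, x) = \inf\{t \ge 0 : d_{TV}(\mathcal{L}(X_t^{(n)}, x), \Pi^{(n)}) \le \varepsilon\}. \]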


Our first theorem shows that if we start from any initial state in which the queues 
are not too long, then the mixing time is small. In particular, if $\varepsilon > 0$ is fixed and 0 
denotes the all-zero $n$-vector, then $\tau^{(n)}(\varepsilon, 0)$ is $O(\ln n)$. More generally, this holds 
if $\varepsilon > 0$ is not too small (explicitly, if $\varepsilon^{-1}$ is bounded polynomially in $n$), $\|x\|_1$ is 
$O(n)$ and $\|x\|_\infty$ is $O(\ln n)$. Observe that the quantity $\delta_{n,t}$ below is 0 if the initial 
state is 0. 


THEOREM 1.1. Let $0 < \lambda < 1$ and let $d$ be a fixed positive integer. For each 
constant $c > 0$ there exists a constant $\eta > 0$ such that the following holds for 
each positive integer $n$. Consider any distribution of the initial queue-lengths 
vector $X_0^{(n)}$, and for each time $t \ge 0$ let 

\[ \delta_{n,t} = \Pr(\|X_0^{(n)}\|_1 > cn) + \Pr(\|X_0^{(n)}\|_\infty > \eta t). \]

Then 

\[ d_{TV}(\mathcal{L}(X_t^{(n)}), \Pi^{(n)}) \le n e^{-\eta t} + 2e^{-\eta n} + \delta_{n,t}. \]


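The $O(\ln n)$ upper bound on the mixing time $\tau^{(n)}$ is of the right order. Indeed, we 
shall see that for a suitable constant $\theta > 0$, if $t \le \theta \ln n$, then 

\[ d_{TV}(\mathcal{L}(X_t^{(n)}), \Pi^{(n)}) = 1 - e^{-\Omega(\ln^2 n)}. \tag{2} \]

Thus $\tau^{(n)}(\varepsilon, 0)$ is $\Theta(\ln n)$ as long as both $\varepsilon^{-1}$ and $(1 - \varepsilon)^{-1}$ are bounded 
polynomially in $n$. 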

Our primary interest is in the maximum queue length $M_t^{(n)} = \|X_t^{(n)}\|_\infty$. Since 
the system mixes rapidly it is natural to consider the stationary case. Our model 
exhibits the "power of two choices" phenomenon (see, e.g., [15]); that is, when we 
move from $d = 1$ choice to $d = 2$ choices, the typical maximum queue length $M_t^{(n)}$ 
drops dramatically. 

We are most interested in the case $d \ge 2$, but first we consider the easy case 
$d = 1$ in order to set the scene. This case is straightforward, since in equilibrium 
the queue lengths are i.i.d. geometric random variables with parameter $\lambda$. We find 
that the maximum queue length $M_t^{(n)}$ is about $\ln n/\ln(1/\lambda)$, and it is not concentrated 
on a bounded range of values (in contrast to the behavior in the balls-and-bins 
model [7], where the maximum load is concentrated on at most two adjacent values, 
even when $d = 1$). 

Given a sequence of events $A_1, A_2, \ldots$, we say that $A_n$ holds asymptotically 
almost surely (a.a.s.) if $A_n$ holds with probability tending to 1 as $n \to \infty$. 


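THEOREM 1.2. Let $0 < \lambda < 1$ and let $d = 1$. For each positive integer $n$, 
suppose that the queue-lengths vector $X_0^{(n)}$ is in the stationary distribution (and 
thus so is the maximum queue length $M_t^{(n)}$). 

(a) For each nonnegative integer $m$ 

\[ \Pr(M_t^{(n)} < m) = (1 - \lambda^m)^n. \]

Thus if $m = m(n)$ and $n \to \infty$, then $M_t^{(n)} \ge m(n)$ a.a.s. if and only if 
$m(n) - \frac{\ln n}{\ln(1/\lambda)} \to -\infty$; and $M_t^{(n)} < m(n)$ a.a.s. if and only if 
$m(n) - \frac{\ln n}{\ln(1/\lambda)} \to +\infty$. 

(b) For any subexponential time $\tau > 0$ (i.e., t = dU.), 

\[ \Bigl(\min_{0 \le t \le \tau} M_t^{(n)}\Bigr) \frac{\ln(1/\lambda)}{\ln n} \to 1 \quad \text{in probability as } n \to \infty. \]

(c) For any constant $K > 0$, 

\[ \Bigl(\max_{0 \le t \le n^K} M_t^{(n)}\Bigr) \frac{\ln(1/\lambda)}{\ln n} \to K + 1 \quad \text{in probability as } n \to \infty. \]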
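
To make the threshold $\ln n/\ln(1/\lambda)$ concrete for $d = 1$, the sketch below evaluates the probability that all $n$ independent geometric queue lengths, with $\Pr(X = k) = (1-\lambda)\lambda^k$, lie below $m$. The function names are hypothetical and the snippet is only a numerical illustration of the sharp transition, not part of the proof.

```python
import math


def prob_max_below(n, lam, m):
    """Pr(max of n i.i.d. geometric queue lengths < m) = (1 - lam**m) ** n."""
    return (1.0 - lam ** m) ** n


def centre(n, lam):
    """The value ln n / ln(1/lam) around which the maximum sits."""
    return math.log(n) / math.log(1.0 / lam)


if __name__ == "__main__":
    n, lam = 1000, 0.5
    print("centre:", centre(n, lam))
    for m in (5, 8, 10, 12, 15, 20):
        print(m, prob_max_below(n, lam, m))
```

For $m$ well below the centre the printed probability is close to 0, and for $m$ well above it the probability is close to 1.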


Now we consider the case $d \ge 2$, when the maximum queue length $M_t^{(n)}$ is far 
smaller, and it is concentrated on just two values $m_d$ and $m_d - 1$. This is our main 
result. 


THEOREM 1.3. Let $0 < \lambda < 1$ and let $d \ge 2$ be an integer. Then there exists an 
integer-valued function $m_d = m_d(n) = \ln\ln n/\ln d + O(1)$ such that the following 
holds. For each positive integer $n$, suppose that the queue-lengths vector $X_0^{(n)}$ is 
in the stationary distribution (and thus so is the maximum queue length $M_t^{(n)}$). 
Then for each time $t \ge 0$, $M_t^{(n)}$ is $m_d(n)$ or $m_d(n) - 1$ a.a.s.; and further, for any 
constant $K > 0$ there exists $c = c(K)$ such that 

\[ \max_{0 \le t \le n^K} \bigl|M_t^{(n)} - \ln\ln n/\ln d\bigr| \le c \quad \text{a.a.s.} \tag{3} \]

The functions $m_2(n), m_3(n), \ldots$ may be defined as follows. For $d = 2, 3, \ldots$ 
let $i_d(n)$ be the least integer $i$ such that $\lambda^{(d^i-1)/(d-1)} < n^{-1/2}\ln^2 n$. Then we let 
$m_2(n) = i_2(n) + 1$, and for $d \ge 3$ let $m_d(n) = i_d(n)$. (We shall see that with high 
probability the proportion of queues of length at least $i$ is close to $\lambda^{(d^i-1)/(d-1)}$.) 
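
The definition of $m_d(n)$ is easy to evaluate numerically. The sketch below is an illustration under stated assumptions: it reads the threshold as $n^{-1/2}\ln^2 n$, searches only over nonnegative integers $i$, and uses the hypothetical function names `i_d` and `m_d`.

```python
import math


def i_d(n, lam, d):
    """Least nonnegative i with lam ** ((d**i - 1)/(d - 1)) < n**-0.5 * ln(n)**2.

    Assumes d >= 2 and 0 < lam < 1.
    """
    threshold = n ** -0.5 * math.log(n) ** 2
    i = 0
    while lam ** ((d ** i - 1) / (d - 1)) >= threshold:
        i += 1
    return i


def m_d(n, lam, d):
    """m_2(n) = i_2(n) + 1, and m_d(n) = i_d(n) for d >= 3."""
    return i_d(n, lam, d) + 1 if d == 2 else i_d(n, lam, d)


if __name__ == "__main__":
    for n in (10 ** 4, 10 ** 6, 10 ** 9, 10 ** 12):
        print(n, m_d(n, 0.5, 2), m_d(n, 0.5, 3))
```

Because the exponent $(d^i-1)/(d-1)$ grows geometrically in $i$, the loop stops after roughly $\ln\ln n/\ln d$ iterations, in line with the order of growth stated in Theorem 1.3.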

Results can be obtained on deviations of the maximum queue length over longer 
time intervals, using arguments just as in [7]; we shall not discuss such results 
further here. 

We have now described our model and stated our main results. The later sections 
of this paper are organized as follows. In Section 2 we prove Theorem 1.1, which 
shows that for each fixed positive integer $d$ the process $(X_t^{(n)})$ starting from a nice 
initial state mixes in logarithmic time. The proof is based on considering a pair of 
"adjacent" initial states and analyzing a suitable random walk. We give also a 
similar result involving the Wasserstein distance. In Section 3 we focus on the 
straightforward case $d = 1$, and prove Theorem 1.2. In Section 4 we show that a Lipschitz 
function of the queue-lengths vector in equilibrium is tightly concentrated around 
its mean. To do this, we consider a queue-lengths process $(X_t^{(n)})$ starting from 0, 
and use the bounded differences approach to establish concentration at a suitable 
time $t$ when $\mathcal{L}(X_t^{(n)})$ is close to the equilibrium distribution. In Section 5 we use 
the concentration property to estimate the proportions of queues of at least some 
given lengths, and to bound their fluctuations over long time intervals. In the short 
Section 6 that follows we establish the logarithmic lower bound (2) on the mixing 
times in Theorem 1.1. In Section 7 we prove Theorem 1.3, and thus complete the 
proofs of the new results stated above. Finally, we make some brief concluding 
remarks. 

Several times we shall use the fact that, if $X$ is a binomial or Poisson random 
variable with mean $\mu$, then for each $0 \le \varepsilon \le 1$ we have 

\[ \Pr(X - \mu \le -\varepsilon\mu) \le e^{-\varepsilon^2\mu/2} \tag{4} \]

and 

\[ \Pr(X - \mu \ge \varepsilon\mu) \le e^{-\varepsilon^2\mu/3}, \tag{5} \]

and if $x \ge 2e\mu$, then 

\[ \Pr(X \ge x) \le 2^{-x}. \tag{6} \]

For the above results, see, for example, Theorem 2.3(b) and inequalities (2.7) 
and (2.8) in [12]. 
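
These tail bounds can be compared with exact binomial tails on small examples. The sketch below is illustrative only, with hypothetical helper names. It reads the bounds as $e^{-\varepsilon^2\mu/2}$ for the lower tail, $e^{-\varepsilon^2\mu/3}$ for the upper tail, and $2^{-x}$ for $x \ge 2e\mu$, and it relies on `math.comb` from the Python standard library.

```python
import math


def binom_pmf(n, p, k):
    """Exact Pr(Bin(n, p) = k)."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def lower_tail(n, p, x):
    """Exact Pr(Bin(n, p) <= x)."""
    return sum(binom_pmf(n, p, k) for k in range(0, math.floor(x) + 1))


def upper_tail(n, p, x):
    """Exact Pr(Bin(n, p) >= x)."""
    return sum(binom_pmf(n, p, k) for k in range(math.ceil(x), n + 1))


if __name__ == "__main__":
    n, p, eps = 200, 0.3, 0.5
    mu = n * p
    print(lower_tail(n, p, (1 - eps) * mu), math.exp(-eps ** 2 * mu / 2))
    print(upper_tail(n, p, (1 + eps) * mu), math.exp(-eps ** 2 * mu / 3))
    n2, p2 = 200, 0.05
    x = math.ceil(2 * math.e * n2 * p2)
    print(upper_tail(n2, p2, x), 2.0 ** -x)
```

In each printed pair the exact tail (first number) should not exceed the bound (second number).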


2. Rapid mixing: proof of Theorem 1.1. In this section we shall in fact 
prove both Theorem 1.1 and a similar result involving the Wasserstein distance 
instead of the total variation distance. The Wasserstein distance may be defined 
by $d_W(\mu_1, \mu_2) = \inf E[\|X - Y\|_1]$ where the infimum is over all couplings where 
$\mathcal{L}(X) = \mu_1$ and $\mathcal{L}(Y) = \mu_2$. Observe that for integer-valued random variables, 
the total variation distance between the corresponding laws is always at most the 
Wasserstein distance. The following result will also be used in [8], where we 
consider the asymptotic distribution of a queue length. 




LEMMA 2.1. Let $0 < \lambda < 1$ and let $d$ be a fixed positive integer. For each 
constant $c > \frac{\lambda}{1-\lambda}$ there exists a constant $\eta > 0$ such that the following holds for each 
positive integer $n$. Let $M$ denote the stationary maximum queue length. Consider 
any distribution of the initial queue-lengths vector $X_0$ such that $\|X_0\|_1$ has finite 
mean. For each time $t \ge 0$ let 

\[ \delta_{n,t} = 2E\bigl[\|X_0\|_1 \mathbf{1}_{\|X_0\|_1 > cn}\bigr] + 2cn \Pr(M_0 > \eta t). \]

Then 

\[ d_W(\mathcal{L}(X_t), \Pi) \le n e^{-\eta t} + 2cn \Pr(M > \eta t) + 2e^{-\eta n} + \delta_{n,t}. \]


In order to prove this result we shall couple the queue-lengths process $(X_t)$ and a 
corresponding copy $(Y_t)$ of the process in equilibrium in such a way that with high 
probability $\|X_t - Y_t\|_1$ decreases quickly to 0. We model departures by a Poisson 
process at rate $n$ together with an independent selection process that generates a 
uniformly random queue at each event time. If the queue selected is nonempty, 
then the customer currently in service departs; otherwise nothing happens. 

We start the proof with a lemma concerning the return times to the origin of a 
generalized random walk on $\{0, 1, 2, \ldots\}$. This lemma will be needed later to show 
that a certain coupling happens quickly. 


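LEMMA 2.2. Let $\phi_0 \subseteq \phi_1 \subseteq \phi_2 \subseteq \cdots$ be a filtration. Let $Z_1, Z_2, \ldots$ be $\{0, \pm 1\}$-valued 
random variables, where each $Z_i$ is $\phi_i$-measurable. Let $S_0 \ge 0$ a.s. and 
for each positive integer $j$ let $S_j = S_0 + \sum_{i=1}^{j} Z_i$. Let $A_0, A_1, \ldots$ be events, where 
each $A_i$ is $\phi_i$-measurable. 

Suppose that there is a constant positive integer $k_0$ and a constant $\delta$ with 
$0 < \delta < 1/2$ such that 

\[ \Pr(Z_j = -1 \mid \phi_{j-1}) \ge \delta \quad \text{on } A_{j-1} \cap \{S_{j-1} \in \{1, \ldots, k_0 - 1\}\} \]

and 

\[ \Pr(Z_j = -1 \mid \phi_{j-1}) \ge \delta + 1/2 \quad \text{on } A_{j-1} \cap \{S_{j-1} \ge k_0\}. \]

Then there exists $\eta > 0$ such that for each positive integer $m$ 

\[ \Pr\Bigl(\Bigl(\bigcap_{i=1}^{m} \{S_i \ne 0\}\Bigr) \cap \Bigl(\bigcap_{i=0}^{m-1} A_i\Bigr)\Bigr) \le \Pr(S_0 > \eta m) + 2e^{-\eta m}. \]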


PROOF. Let us ignore the events $A_i$ in the meantime; we shall see later that it 
is easy to incorporate them into the argument. We define random times $T_j$ and $T'_j$ 
as follows. Let 

\[ T_0 = \min\{i \ge 1 : S_i < k_0\}, \]

and let 

\[ T_{j+1} = \min\{i > T_j : S_i < k_0\}. \]

Further, let $T'_0 = T_0$ and let 

\[ T'_{j+1} = \min\{i \ge T'_j + k_0 : S_i < k_0\}. \]

Observe that always $T'_j \le T_{jk_0}$. 
The key fact is that for all positive integers $m$ and $j$, 

\[ \Pr\Bigl(\bigcap_{i=1}^{m} \{S_i \ne 0\}\Bigr) \le (1 - \delta^{k_0})^j + \Pr(T'_j > m). \tag{7} \]

To see this, note first that if $T'_j = t$, then $S_t < k_0$ and, for $1 \le k < k_0$, on $S_t = k$ we 
have 

\[ \Pr\Bigl(\bigcap_{u=1}^{k} \{Z_{t+u} = -1\} \Bigm| \phi_t\Bigr) \ge \delta^k. \]

Now for each $i = 0, 1, \ldots$ let $B_i$ be the event that $S_r \ne 0$ for each $r = T'_i, \ldots, T'_i + k_0 - 1$. 
Then $\Pr(B_i \mid \phi_{T'_i}) \le 1 - \delta^{k_0}$, and so $\Pr(\bigcap_{i=0}^{j-1} B_i) \le (1 - \delta^{k_0})^j$. But 

\[ \Bigl(\bigcap_{i=1}^{m} \{S_i \ne 0\}\Bigr) \cap \{T'_j \le m\} \subseteq \bigcap_{i=0}^{j-1} B_i, \]

and (7) follows. 

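We need to investigate the times $T'_j$ in order to be able to ensure that the 
term $\Pr(T'_j > m)$ in (7) is small. First we consider the times $T_j$. Note that 
$\frac{r}{2} \le (1 - \delta) r (\frac{1}{2} + \delta)$ for $r \ge 0$. Let $h = \delta^2/4$. Then by (4), for all nonnegative 
integers $j$ and $r$, 

\[ \Pr(T_{j+1} - T_j > r \mid \phi_{T_j}) \le \Pr\Bigl(B\bigl(r, \tfrac{1}{2} + \delta\bigr) \le \tfrac{r}{2}\Bigr) \le e^{-hr}. \]

(Here we are using $B$ to denote a binomial random variable.) Now let $s$ be a 
positive integer. Let $b = (\delta - 2\delta^2)^{-1}$, so $b > 0$. Note that for $r \ge bs$, we have 
$\frac{r+s}{2} \le (1 - \delta) r (\frac{1}{2} + \delta)$. Hence, again using (4), we see that, on the event $\{S_0 \le s\}$, 

\[ \Pr(T_0 > r \mid \phi_0) \le \Pr\Bigl(B\bigl(r, \tfrac{1}{2} + \delta\bigr) \le \tfrac{r+s}{2}\Bigr) \le e^{-hr}. \]

Thus in particular, for $r \ge bs$, 

\[ \Pr(T_0 > r) \le e^{-hr} + \Pr(S_0 > s). \]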

Let $Z$ be a random variable taking positive integer values and such that 
$\Pr(Z > r) = e^{-hr}$ for each positive integer $r$. Then $T_j - T_0$ is stochastically at most 
the sum of $j$ i.i.d. random variables, each distributed like $Z$. Note that $M_Z(h/2) = 
E[e^{(h/2)Z}] < \infty$. Let $c_1 > 0$ be sufficiently large that $M_Z(h/2) e^{-c_1 h/2} < 1$, say 
$M_Z(h/2) e^{-c_1 h/2} = e^{-c_2}$ where $c_2 > 0$. Then 

\[ \Pr(T_j - T_0 > c_1 j) \le e^{-(h/2) c_1 j} E\bigl[e^{(h/2)(T_j - T_0)}\bigr] \le e^{-(h/2) c_1 j} M_Z(h/2)^j = e^{-c_2 j}. \]

Putting the last two results together, and using the fact that $T'_j \le T_{jk_0}$, we have for 
each $r \ge bs$, 

\[ \Pr(T'_j > r + c_1 j k_0) \le \Pr(T_0 > r) + \Pr(T_{jk_0} - T_0 > c_1 j k_0) \le e^{-hr} + \Pr(S_0 > s) + e^{-c_2 j k_0}. \]

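We may now complete the proof (still for the case without the events $A_i$). Let 
$\eta_1 > 0$ and $\eta_2 > 0$ be sufficiently small that $1 - \eta_1 - \eta_2 c_1 k_0 > 0$. Let $r = \lceil \eta_1 m \rceil$, 
$j = \lceil \eta_2 m \rceil$ and $s = \lfloor r/b \rfloor$. Then for $m$ sufficiently large, $r + c_1 j k_0 \le m$; and so 
by (7) and the last inequality 

\[ \Pr\Bigl(\bigcap_{i=1}^{m} \{S_i \ne 0\}\Bigr) \le e^{-j\delta^{k_0}} + e^{-hr} + \Pr(S_0 > s) + e^{-c_2 j k_0} 
\le e^{-\delta^{k_0} \eta_2 m} + e^{-h \eta_1 m} + \Pr(S_0 > s) + e^{-c_2 k_0 \eta_2 m}. \]

Thus there exists a constant $\eta' > 0$ and an integer $m_0$ such that for each integer 
$m \ge m_0$ 

\[ \Pr\Bigl(\bigcap_{i=1}^{m} \{S_i \ne 0\}\Bigr) \le \Pr(S_0 > \eta' m) + 2e^{-\eta' m}. \]

But now we may set $\eta = \min\{\eta', \ln 2/m_0\} > 0$ and then for each positive integer $m$ 

\[ \Pr\Bigl(\bigcap_{i=1}^{m} \{S_i \ne 0\}\Bigr) \le \Pr(S_0 > \eta m) + 2e^{-\eta m}. \]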

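Let us now bring in the events $A_i$. For each $i$ define $\tilde Z_i = Z_i \mathbf{1}_{A_{i-1}} - \mathbf{1}_{\bar A_{i-1}}$ 
(where $\bar A_i$ denotes the complement of $A_i$). Then $\Pr(\tilde Z_j = -1 \mid \phi_{j-1}) \ge \delta$ on 
$\{S_{j-1} \in \{1, \ldots, k_0 - 1\}\}$ and $\Pr(\tilde Z_j = -1 \mid \phi_{j-1}) \ge 1/2 + \delta$ on $\{S_{j-1} \ge k_0\}$. Let 
$\tilde S_j = S_0 + \sum_{i=1}^{j} \tilde Z_i$. Then by what we have just proved applied to the $\tilde Z_i$, 

\[ \Pr\Bigl(\Bigl(\bigcap_{i=1}^{m} \{S_i \ne 0\}\Bigr) \cap \Bigl(\bigcap_{i=0}^{m-1} A_i\Bigr)\Bigr) \le \Pr\Bigl(\bigcap_{i=1}^{m} \{\tilde S_i \ne 0\}\Bigr) \le \Pr(S_0 > \eta m) + 2e^{-\eta m}, \]

as required. □ 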


We now introduce a natural coupling of $n$-queue queue-lengths processes $(X_t)$ 
with different initial states. 

Arrival times form a Poisson process at rate $\lambda n$, and there is a corresponding 
sequence of uniform choices of lists of $d$ queues. Departure times form a Poisson 
process at rate $n$, and there is a corresponding sequence of uniform selections of a 
queue, except that departures from empty queues are ignored. These four processes 
are independent. Denote the arrival time process by $\mathbf{T}$, the choices process by $\mathbf{D}$, 
the departure time process by $\tilde{\mathbf{T}}$ and the selection process by $\tilde{\mathbf{D}}$. 

Suppose that we are given a sequence of arrival times $\mathbf{t}$ with corresponding 
queue choices $\mathbf{d}$, and a sequence of departure times $\tilde{\mathbf{t}}$ with corresponding 
selections $\tilde{\mathbf{d}}$ of a queue, where all these times are distinct. For each possible 
initial queue-lengths vector $x \in \Omega = (\mathbb{Z}^+)^n$ this yields a deterministic queue-lengths 
process $(x_t)$ with $x_0 = x$: let us write $x_t = s_t(x; \mathbf{t}, \mathbf{d}, \tilde{\mathbf{t}}, \tilde{\mathbf{d}})$. Then for each $x \in \Omega$, 
the process $(s_t(x; \mathbf{T}, \mathbf{D}, \tilde{\mathbf{T}}, \tilde{\mathbf{D}}))$ has the distribution of a queue-lengths process with 
initial state $x$. 


LEMMA 2.3. Fix any 4-tuple $\mathbf{t}, \mathbf{d}, \tilde{\mathbf{t}}, \tilde{\mathbf{d}}$ as above, and for each $x \in \Omega$ write 
$s_t(x)$ for $s_t(x; \mathbf{t}, \mathbf{d}, \tilde{\mathbf{t}}, \tilde{\mathbf{d}})$. Then for each $x, y \in \Omega$, both $\|s_t(x) - s_t(y)\|_1$ and 
$\|s_t(x) - s_t(y)\|_\infty$ are nonincreasing; and further, if $0 \le t < t'$ and $s_t(x) \le s_t(y)$, 
then $s_{t'}(x) \le s_{t'}(y)$. 


(We shall not need the result about $\|s_t(x) - s_t(y)\|_\infty$ in this paper, but it is 
convenient to record the result for use elsewhere, in particular in [8].) 


PROOF OF LEMMA 2.3. Let $t_0$ be a jump time; let $x_{t_0-} = x$ and $y_{t_0-} = y$; and 
let $x_{t_0} = x'$ and $y_{t_0} = y'$. 
Suppose that $t_0$ is an arrival time. We want to show that 

\[ \|x' - y'\|_1 \le \|x - y\|_1 \tag{8} \]

and 

\[ \|x' - y'\|_\infty \le \|x - y\|_\infty. \tag{9} \]

If the customer joins the same queue in the two processes, then of course $x' - y' = 
x - y$, and hence (8) and (9) hold. Suppose that the customer joins queue $i$ in 
the $x$-process and joins queue $j$ in the $y$-process, where $i \ne j$. Then $\delta_x = x(j) - 
x(i) \ge 0$ and $\delta_y = y(i) - y(j) \ge 0$; and by the tie-breaking rule $\delta_x + \delta_y > 0$. 

Suppose first that (8) does not hold. Then we must have $x(i) \ge y(i)$ and 
$y(j) \ge x(j)$, and so 

\[ x(i) \ge y(i) = y(j) + \delta_y \ge x(j) + \delta_y = x(i) + \delta_x + \delta_y > x(i), \]

a contradiction. Hence (8) must hold. 

Now suppose that (9) does not hold. Then either $x(i) - y(i) = \|x - y\|_\infty$ or 
$y(j) - x(j) = \|x - y\|_\infty$. But we cannot have $x(i) - y(i) = \|x - y\|_\infty$, since 

\[ x(j) - y(j) = x(i) + \delta_x - (y(i) - \delta_y) > x(i) - y(i). \]

Similarly we cannot have $y(j) - x(j) = \|x - y\|_\infty$, and so (9) must hold. 

Suppose now that $t_0$ is a departure time, from queue $i$. If both queues are 
nonempty or both are empty, then $x' - y' = x - y$, and hence of course (8) and (9) hold. 
If exactly one queue is nonempty, then $|x'(i) - y'(i)| = |x(i) - y(i)| - 1$, and so again (8) 
and (9) hold. 

The final comment on monotonicity is straightforward. For consider a jump 
time $t_0$ with $x$, $x'$, $y$ and $y'$ defined as above, and suppose that $x \le y$. If $t_0$ is a 
departure time, then clearly $x' \le y'$, so suppose that $t_0$ is an arrival time. But if the 
new customer joins queue $i$ in the $x$-process and if $x(i) = y(i)$, then the customer 
joins queue $i$ also in the $y$-process, so $x' \le y'$. □ 
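
The monotonicity just proved can be observed on the jump chain of the coupling. The sketch below is illustrative only: the function name `coupled_distances` is hypothetical, and it drives two queue-lengths vectors with the same arrival choices and the same departure selections, recording the $L_1$ and $L_\infty$ distances after every event.

```python
import random


def coupled_distances(x, y, lam, d, steps, seed=0):
    """Run two coupled copies for `steps` events and return the lists of
    L1 and L-infinity distances (including the initial distance)."""
    rng = random.Random(seed)
    x, y = list(x), list(y)
    n = len(x)
    l1 = [sum(abs(a - b) for a, b in zip(x, y))]
    linf = [max(abs(a - b) for a, b in zip(x, y))]
    for _ in range(steps):
        if rng.random() < lam / (lam + 1.0):
            # Shared arrival: same list of d choices, each copy joins the
            # first shortest queue in the list according to its own state.
            choices = [rng.randrange(n) for _ in range(d)]
            x[min(choices, key=lambda q: x[q])] += 1
            y[min(choices, key=lambda q: y[q])] += 1
        else:
            # Shared departure selection, ignored by a copy whose selected
            # queue is empty.
            q = rng.randrange(n)
            if x[q] > 0:
                x[q] -= 1
            if y[q] > 0:
                y[q] -= 1
        l1.append(sum(abs(a - b) for a, b in zip(x, y)))
        linf.append(max(abs(a - b) for a, b in zip(x, y)))
    return l1, linf
```

Both returned lists should be nonincreasing for any pair of initial states, as the lemma asserts.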


The position of a customer refers to first-in-first-out queue discipline; that is, 
for a given customer in queue $j$ at time $t$, her position is one plus the number 
of customers in queue $j$ at time $t$ who arrived before her. Given a queue-lengths 
vector $x$ and a nonnegative integer $i$, let $\ell(i, x)$ be the number of queues with 
length at least $i$. We shall be interested in $\ell(i, X_t)$, the random number of queues 
with length at least $i$ at time $t$. Observe that if $\|x\|_1 \le cn$, then $\ell(i, x) \le cn/i$: this 
is how we shall ensure that $\ell(i, X_t)$ is not too large. 

Next, let us consider the equilibrium distribution, and note some upper bounds 
on the total number of customers in the system and on the maximum queue length, 
which follow from the easy case d = 1. 

LEMMA 2.4. (a) For any constant $c > \frac{\lambda}{1-\lambda}$ there is a constant $\eta > 0$ such that 
for each positive integer $n$, in equilibrium the queue-lengths process $(X_t)$ satisfies 

\[ \Pr(\|X_t\|_1 > cn) \le e^{-\eta n} \]

for each time $t \ge 0$. 
(b) For each positive integer $n$, in equilibrium the maximum queue length $M_t$ 
satisfies 

\[ \Pr(M_t \ge k) \le n\lambda^k \]

for each positive integer $k$ and each time $t \ge 0$. 


PROOF. For both parts of the lemma, it suffices to consider the case $d = 1$; 
for, as follows from a coupling result in [16] (see also [3]), if $d \le d'$, then in 
equilibrium for each $k$ the number of customers with position at least $k$ with $d'$ 
choices is stochastically at most the corresponding number with $d$ choices. (Note 
that the maximum queue length $M_t$ is at least $k$ if and only if at time $t$ there is at 
least one customer with position at least $k$.) So, suppose that $d = 1$. 

But now by the splitting property of the Poisson process the $n$ queue lengths 
$X_t(j)$ are independent; and each has the geometric distribution where $\Pr(X_t(j) = 
k) = (1 - \lambda)\lambda^k$, with mean $\lambda/(1 - \lambda)$. Thus the total number of customers is a 
sum of $n$ i.i.d. random variables with finite moment-generating function in some 
neighborhood of 0, and part (a) follows easily. For part (b), note that 

\[ \Pr(M_t \ge k) \le n \Pr(X_t(1) \ge k) = n\lambda^k. \] □ 

The bound in part (a) above extends easily over time. 


LEMMA 2.5. Let $c > \frac{\lambda}{1-\lambda}$ be a constant. Then there is a constant $\eta > 0$ such 
that for each positive integer $n$, in equilibrium the queue-lengths process $(X_t)$ 
satisfies 

\[ \Pr(\|X_t\|_1 > cn \text{ for some } t \in [0, e^{\eta n}]) \le 2e^{-\eta n}. \]


PROOF. Let $\frac{\lambda}{1-\lambda} < c' < c$ and let $\varepsilon = c - c' > 0$. By part (a) of the last result, 
there exists a constant $\eta > 0$ such that for each positive integer $n$ and each $t \ge 0$ 
we have $\Pr(\|X_t\|_1 > c'n) \le e^{-3\eta n}$; and we may assume that $\eta \le \varepsilon/18$. 

Let $\delta = \varepsilon/2\lambda$. Let $j = \lceil e^{\eta n}/\delta \rceil$, and consider times $t_r = r\delta$, for $r = 0, \ldots, j$. 
The mean number of arrivals in a subinterval $[t_{r-1}, t_r)$ of length $\delta$ is $\varepsilon n/2$, so 
by (5) the probability that more than $\varepsilon n$ arrivals occur is at most $e^{-\varepsilon n/6}$. Then 

\[ \Pr(\|X_t\|_1 > cn \text{ for some } t \in [0, e^{\eta n}]) 
\le \sum_{r=0}^{j} \Pr(\|X_{t_r}\|_1 > c'n) + \sum_{r=1}^{j} \Pr([t_{r-1}, t_r) \text{ has} > \varepsilon n \text{ arrivals}) 
\le (e^{\eta n}/\delta + 2)(e^{-3\eta n} + e^{-\varepsilon n/6}) \le e^{-\eta n} \]

provided $n$ is sufficiently large, and the lemma follows (for a suitable new value 
of $\eta$). □ 


We say that two states are adjacent if they differ by adding one customer to one 
of the queues in one of the states. The following lemma shows that two 
queue-lengths processes $(X_t)$ and $(X'_t)$ will coalesce rapidly if $X_0$ and $X'_0$ are adjacent, 
the “unbalanced” queues are not too long and the total numbers of customers are 
not too large. 

First we fix some constants. Let $0 < \lambda < 1$ and let $d$ be a positive integer. Let 
$c > \frac{\lambda}{1-\lambda}$ and let $\eta > 0$ be as in Lemma 2.5. (This will cause no loss of generality, 
as the case $c > \frac{\lambda}{1-\lambda}$ of course implies the case $c \le \frac{\lambda}{1-\lambda}$ in Theorem 1.1.) Let $\varepsilon > 0$ 
satisfy $\nu = d\lambda\varepsilon^{d-1} < 1$. Let $k_0 = \lceil 2c/\varepsilon \rceil$. We shall keep all these constants fixed 
from now on, until the end of the section. 




LEMMA 2.6. There exist constants $\alpha, \beta > 0$ such that the following holds. Let 
$n$ be a positive integer. Let $x, x'$ be a pair of states such that $x(k) = x'(k) - 1$, 
$x(j) = x'(j)$ for all $j \ne k$, and $\|x'\|_1 \le cn$. Consider the queue-lengths processes 
$(X_t)$ and $(X'_t)$, given that $X_0 = x$ and $X'_0 = x'$. For all times $t \ge \alpha\|x'\|_\infty$ we have 

\[ E\|X_t - X'_t\|_1 = \Pr(X_t \ne X'_t) \le e^{-\beta t} + 2e^{-\beta n}. \]


PROOF. By Lemma 2.3, $X_t$ and $X'_t$ are always either neighbors or equal, 
always $X_t \le X'_t$, and if for some time $s$ we have $X_s = X'_s$, then $X_t = X'_t$ for all $t \ge s$. 
Thus in particular $E\|X_t - X'_t\|_1 = \Pr(X_t \ne X'_t)$. 

Initially, the queue $k$ is “unbalanced” [i.e., $X_0(k) \ne X'_0(k)$] and all other queues 
are "balanced." Observe that the index of the unbalanced queue in the coupled 
process may change over time. [E.g., suppose that $d = 2$, and just before an arrival 
time $t$, queue $i$ is unbalanced, and queues $i$ and $j$ are chosen. Suppose further 
that $X'_{t-}(i) = X'_{t-}(j)$ or $X_{t-}(i) = X_{t-}(j)$. In the former case we select queue $i$ 
for the $(X_t)$ process, but we will select queue $j$ for the $(X'_t)$ process if we chose $j$ 
before $i$. In the latter case we select queue $j$ for the $(X'_t)$ process, but we will select 
queue $i$ for the $(X_t)$ process if we chose $i$ before $j$. In both cases, it will now be 
queue $j$ that is unbalanced.] 

Let $W_t$ denote the longer of the unbalanced queue lengths at time $t$ if there is 
such a queue, and let $W_t = 0$ otherwise. The time for the two processes to coalesce 
is the time $T$ until $W_t$ hits 0. We shall use Lemma 2.2 to give a suitable upper 
bound on $\Pr(W_t > 0)$. The idea is that with high probability the total number of 
customers in the system is not too large, hence the unbalanced queue length $W_t$ 
will often be driven below $k_0$, and then there is a chance of going all the way down 
to 0. 

For each time $t \ge 0$ let $B_t$ be the event that $\|X_s\|_1 \le 2cn$ for each $s \in [0, t)$. 
It follows from Lemmas 2.3 and 2.5 that there is a constant $\eta > 0$ such that 
$\Pr(\bar B_t) \le 2e^{-\eta n}$ for each positive integer $n$ and each time $t \in [0, t_0]$, where 
$t_0 = e^{\eta n}$. To see this, note that if $(X''_t)$ is a copy of the process such that $X''_0 = 0$ 
a.s., then there is a coupling such that $\|X_t - X''_t\|_1 \le cn$ for all times $t \ge 0$. But 
then we can couple $(X''_t)$ with an equilibrium process $(\hat X_t)$ so that $X''_t \le \hat X_t$ for 
all times $t \ge 0$, and thus $\Pr(\|X''_t\|_1 > cn \text{ for some } t \in [0, t_0]) \le 2e^{-\eta n}$. Finally, if 
$\|X''_t\|_1 \le cn$ and $\|X_t - X''_t\|_1 \le cn$, then $\|X_t\|_1 \le 2cn$. 

We need some notation concerning the jumps in the unbalanced queue 
length $W_t$. Let $N_t$ be the number of such jumps in the interval $[0, t]$. Also let 
$N = N_T$, the total number of these jumps. Let $T_j$ be the time of the $j$th jump if 
$N \ge j$, and otherwise let $T_j$ be the coalescence time $T$. 

Let $S_0 = x'(k) = W_0$, the longer unbalanced queue length at time $t = 0$. For 
each positive integer $j$, if $N \ge j$, let $S_j = W_{T_j}$, which is either 0 or the longer 
of the unbalanced queue lengths at time $T_j$, immediately after the $j$th arrival or 
departure at the unbalanced queue. Also, if $N \ge j$, let $Z_j$ be the $\pm 1$-valued random 
variable $S_j - S_{j-1}$. For each nonnegative integer $j$, let $\phi_j$ be the $\sigma$-field generated 
by all events before time $T_{j+1}$. Let also $A_j$ be the $\phi_j$-measurable event $B_{T_{j+1}}$. 
Let 

\[ \delta = \min\Bigl\{\frac{1}{\nu + 1} - \frac{1}{2}, \frac{1}{d\lambda + 1}\Bigr\} \]

and note that $\delta > 0$ since $0 < \nu < 1$. We shall use Lemma 2.2, with this value of $\delta$. 
Note first that the arrival rate at the longer of the unbalanced queues is always at 
most $d\lambda$, and the departure rate is 1. Thus on the event $N \ge j$ we have 

\[ \Pr(Z_j = -1 \mid \phi_{j-1}) \ge \frac{1}{d\lambda + 1} \ge \delta. \]

The key observation is that, on the event $\{N \ge j\} \cap A_{j-1} \cap \{S_{j-1} \ge k_0\}$, we have 

\[ \Pr(Z_j = -1 \mid \phi_{j-1}) \ge \frac{1}{1 + \nu} \ge \frac{1}{2} + \delta. \]

For consider a time $t \ge 0$. Note that on $B_t$ we have $\ell(k_0, X_{t-}) \le 2cn/k_0 \le \varepsilon n$, that 
is, there are at most $\varepsilon n$ queues with length at least $k_0$. Suppose that $T > t$, $B_t$ holds 
and $W_t \ge k_0$. Arrivals into the system occur at rate $n\lambda$, and the probability that such 
an arrival joins the longer unbalanced queue is at most $d\varepsilon^{d-1}/n$. Hence the rate of 
arrivals at the longer unbalanced queue is at most $\nu = d\lambda\varepsilon^{d-1}$, whereas the rate of 
departures is 1, and so the next jump is a departure with probability at least $\frac{1}{1+\nu}$. 

We have now shown that on the event N > m, Sm — So can be written as a 
sum » 7.1 Zi for (—1, 1}-valued random variables Z; that satisfy the conditions of 
Lemma 2.2, with the same notation for 4, etc. Hence there exists a constant 7 > 0, 
such that for all m > mo = | DE (k)| 


: m-—1 m 
(uw -—min (n a) N (Ass 0))) «2e Wm. 


i=0 i=] 
Let n be sufficiently large that 2mọ < tọ = e", let t satisfy 2mọ < t < to and let 
m = |t/2|. Then, since jumps occur at rate at least 1 while the queue is nonempty, 
by (4) 
Pr((T > t) A (N; < mp x e!/5, 
Also, 


m-i 


es >m}N ( U 2) < Pr(B;) x 2e", 
i=0 
This gives the desired upper bound on Pr(T > t), since 


m-—1 
Pr(T >t) x Pr((T >t} N{N; <m} en >m} n (Ù 2) 
i=0 


epus > m}Nn (n a) ^ (Pus «»)) 


i=0 


i=l 




Thus we have shown that
$$\Pr(X_t \ne Y_t) \le e^{-\beta t} + 2e^{-\beta n}$$
whenever $2m_0 \le t \le t_0$.
Now we check that we can drop the upper bound $t_0$ on $t$. For let $n$ be sufficiently large that $e^{-\beta t_0} \le e^{-\beta n}$. If $t \ge t_0$, then
$$\Pr(X_t \ne Y_t) \le \Pr(X_{t_0} \ne Y_{t_0}) \le 3e^{-\beta n},$$
and so
$$\Pr(X_t \ne Y_t) \le e^{-\beta t} + 3e^{-\beta n}$$
for all $t \ge 2m_0$.

Finally, by replacing $\beta$ with a smaller positive constant, we can replace 3 by 2, and ensure that the inequality holds for all positive integers $n$, not just for sufficiently large $n$. □


Recall that we have fixed a constant $c > \lambda/(1 - \lambda)$ and that we are using the coupling introduced before Lemma 2.3.


LEMMA 2.7. Leta and B be as in the last lemma. Let (X4) and (Y;) have any 
initial distributions. Let t > 0, and let the “bad” set B be the set of z € Q = (Z*)y 
such that |z||; > cn or |z|loo > to^. Then 


ELI X: — Yid esyomsez)l < Zene --2e77^). 


PROOF. Given two distinct states $x$ and $y$ in $\bar B = \Omega \setminus B$, we can choose a path $x = z_0, z_1, \ldots, z_m = y$ of adjacent states in $\bar B$ from $x$ down to the all zero state and back up to $y$, where $m \le \|x\|_1 + \|y\|_1$. Let us write $(X_t^x)$ to denote the queue-lengths process starting at $x$, and similarly for $(Y_t^y)$ (so in fact $X_t^x = Y_t^x$ always). By the last lemma, for all states $x$ and $y$ in $\bar B$

$$E[\|X_t^x - Y_t^y\|_1] \le \sum_{i=0}^{m-1} E[\|X_t^{z_i} - X_t^{z_{i+1}}\|_1] \le (\|x\|_1 + \|y\|_1)(e^{-\beta t} + 2e^{-\beta n}).$$

Hence the lemma follows. □


We may now complete the proofs of Theorem 1.1 and Lemma 2.1, by taking $(Y_t)$ to be in equilibrium, with $X_0$ and $Y_0$ independent, and handling the "bad" initial states appropriately. Consider first Theorem 1.1. Note that

$$\begin{aligned}
d_{TV}(\mathcal{L}(X_t), \Pi) &\le \Pr(X_t \ne Y_t)\\
&\le E[1_{X_t \ne Y_t} 1_{(X_0,Y_0) \in \bar B \times \bar B}] + \Pr(X_0 \in B) + \Pr(Y_0 \in B)\\
&\le E[\|X_t - Y_t\|_1 1_{(X_0,Y_0) \in \bar B \times \bar B}] + \Pr(X_0 \in B) + \Pr(Y_0 \in B).
\end{aligned}$$




But $\Pr(\|Y_0\|_1 > cn) = e^{-\Omega(n)}$ (see Lemma 2.5), since $c > \lambda/(1 - \lambda)$. Also, from (25) in Section 5, for each $j \in \{1, \ldots, n\}$ and each nonnegative integer $i$, $\Pr(Y_0(j) \ge i) \le \lambda^i$; and it follows that


Pr(lYolloo > ta!) < ne 9, 


Theorem 1.1 now follows from Lemma 2.7. 
Finally let us complete the proof of Lemma 2.1. Note that $d_W(\mathcal{L}(X_t), \Pi) \le E[\|X_t - Y_t\|_1]$. We break $E[\|X_t - Y_t\|_1]$ into the sum of two parts

$$E[\|X_t - Y_t\|_1 1_{(X_0,Y_0) \in \bar B \times \bar B}] + E[\|X_t - Y_t\|_1 1_{\{X_0 \in B\} \cup \{Y_0 \in B\}}].$$


The first part is bounded in Lemma 2.7. For the second, we have by Lemma 2.3 
that 


E[l X; — Y:llilxceByutroen)] 
< E[|] Xo — Yollilixoenjutroeni] 
< E[(llXoll: + llYoll D (Mixoiisen + Lpxoii<cn,t¥oi>cn) | 
+ E[(|Xoll1 + Yol Lyx) <en, voti sens maxtll Xolloo; | Yolloo}>to—! | 
< E[l XolliLpxoyy>en] + EL Yo h Pr Xoli > cn) + cnPr(|lYolli > cn) 
+ E[IlYolhi1gyogi sen] + 2n (Pr(llXolloo > to^) + Pr(llYolloo > ta"), 


where the last inequality uses the independence of $X_0$ and $Y_0$. Now we may use the estimates above concerning $Y_0$, together with the fact that $E[\|Y_0\|_1] \le \lambda n/(1-\lambda)$, to complete the proof. (Note that

$$E[\|Y_0\|_1]\Pr(\|X_0\|_1 > cn) \le \frac{\lambda}{1-\lambda}\, n \Pr(\|X_0\|_1 > cn) \le E[\|X_0\|_1 1_{\|X_0\|_1 > cn}]$$

since $c > \lambda/(1-\lambda)$.)

3. One choice: proof of Theorem 1.2. Let $0 < \lambda < 1$ and let $d = 1$. Let $X_0$ be in equilibrium.


Part (a). The queue lengths $X_t(j)$ for $j = 1, \ldots, n$ are independent geometric random variables with parameter $\lambda$, and so $\Pr(X_t(j) \ge m) = \lambda^m$ for each nonnegative integer $m$. Hence

$$\Pr(M_t \le m) = (1 - \lambda^{m+1})^n,$$
and so
$$(10)\qquad \exp\bigl(-n\lambda^{m+1}/(1 - \lambda^{m+1})\bigr) \le \Pr(M_t \le m) \le \exp(-n\lambda^{m+1}).$$
The rest of part (a) follows easily.
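
As a numerical illustration of (10), the following minimal sketch evaluates the exact distribution function of the maximum of $n$ independent geometric queue lengths and the two exponential bounds. The function names are hypothetical and chosen only for this example; the bounds rest on the elementary inequalities $e^{-x/(1-x)} \le 1 - x \le e^{-x}$ for $0 \le x < 1$.

```python
import math


def max_queue_cdf(n: int, lam: float, m: int) -> float:
    """Exact Pr(M <= m) when the n queue lengths are independent and
    Pr(queue length >= k) = lam**k (the d = 1 equilibrium case)."""
    return (1.0 - lam ** (m + 1)) ** n


def cdf_bounds(n: int, lam: float, m: int) -> tuple[float, float]:
    """Lower and upper bounds on Pr(M <= m) as in inequality (10)."""
    x = lam ** (m + 1)
    lower = math.exp(-n * x / (1.0 - x))
    upper = math.exp(-n * x)
    return lower, upper


if __name__ == "__main__":
    n, lam = 1000, 0.5
    for m in (5, 10, 15, 20):
        lo, hi = cdf_bounds(n, lam, m)
        print(m, lo, max_queue_cdf(n, lam, m), hi)
```

For fixed $\lambda$ the transition from values near 0 to values near 1 happens around $m \approx \ln n / \ln(1/\lambda)$, which is the scale appearing in parts (b) and (c).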




Part (b). The proofs we give for parts (b) and (c) are similar to the proofs 
of the corresponding parts of Theorem 1.2 in [7]. Let V = minoz;«« Mj. 
Let e > 0, let m = m(n) = |(1 — e) and let 6 = =. Consider the 
times iO for i —0,1,...,[v/0]. Let A be the event that Mig < m for some 
i € (0, 1,..., [/0]). We have nA” = Q (nê), and hence by (10) 


$$\Pr(A) \le (\tau/\theta + 1)\exp(-n\lambda^{m}) = o(1).$$


Let $B$ be the event that some queue receives at least two customers in some time interval $[i\theta, (i+1)\theta)$, where $i \in \{0, 1, \ldots, \lfloor \tau/\theta \rfloor\}$. For each queue, the number of arrivals in the interval $[i\theta, (i+1)\theta)$ is $\mathrm{Po}(\lambda\theta)$, and so the probability that there are at least two arrivals is at most $(\lambda\theta)^2$. Hence

$$\Pr(B) \le (\tau/\theta + 1)\, n(\lambda\theta)^2 = O(n\tau\theta) = o(1).$$
But
$$\Pr(V \le m - 2) \le \Pr(A) + \Pr(B),$$
and so $V \ge m - 1$ a.a.s. But by part (a)

$$V \le M_0 \le (1 + \varepsilon)\frac{\ln n}{\ln(1/\lambda)} \quad \text{a.a.s.},$$

and part (b) follows.


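Part (c). Let $Z = \max_{0 \le t \le n^K} M_t$. Let $\varepsilon > 0$. We show first that

$$(11)\qquad Z \le (K + 1 + \varepsilon)\frac{\ln n}{\ln(1/\lambda)} \quad \text{a.a.s.}$$

We argue much as in the proof of part (b). Let $\theta = \exp(-\ln n/\ln\ln n)$. Consider the times $i\theta$ for $i = 0, 1, \ldots, \lfloor n^K/\theta \rfloor$. Let $k = k(n) = \lceil (K + 1 + \varepsilon/2)\ln n/\ln(1/\lambda) \rceil$, and let $A$ be the event that $M_{i\theta} \ge k$ for some $i \in \{0, 1, \ldots, \lfloor n^K/\theta \rfloor\}$. Then since $\Pr(M_0 \ge k) \le n\lambda^k$,

$$\begin{aligned}
\Pr(A) &\le (n^K/\theta + 1)\, n\lambda^k\\
&= \exp\bigl((K + 1 + o(1))\ln n - (K + 1 + \varepsilon/2 + o(1))\ln n\bigr)\\
&= \exp\bigl(-(\varepsilon/2 + o(1))\ln n\bigr)\\
&\to 0 \quad \text{as } n \to \infty.
\end{aligned}$$

For each queue, the number of arrivals in the interval $[i\theta, (i+1)\theta)$ is $\mathrm{Po}(\lambda\theta)$, and

$$\Pr\bigl(\mathrm{Po}(\lambda\theta) \ge \ln n/(\ln\ln n)^2\bigr) \le (\lambda\theta)^{\ln n/(\ln\ln n)^2} = \exp\bigl(-\Omega(\ln^2 n/(\ln\ln n)^3)\bigr).$$

Thus, if $B$ is the event that some queue in some interval $[i\theta, (i+1)\theta)$ where $i \in \{0, 1, \ldots, \lfloor n^K/\theta \rfloor\}$ receives at least $\ln n/(\ln\ln n)^2$ customers, then

$$\Pr(B) = \exp\bigl(-\Omega(\ln^2 n/(\ln\ln n)^3)\bigr).$$

But, for $n$ sufficiently large,

$$\Pr\Bigl(Z > (K + 1 + \varepsilon)\frac{\ln n}{\ln(1/\lambda)}\Bigr) \le \Pr(A) + \Pr(B) \to 0 \quad \text{as } n \to \infty,$$

and so (11) holds.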


Now let $0 < \varepsilon < 1$, and let $k = k(n) = \lfloor (K + 1 - \varepsilon)\ln n/\ln(1/\lambda) \rfloor$. We will show that $Z \ge k$ a.a.s., which will complete the proof of this part and thus of the theorem. For each time $t \ge 0$, let $\phi_t$ be the $\sigma$-field generated by all events until time $t$. Let $c > \lambda/(1 - \lambda)$. Let $C$ be the event that $\|X_t\|_1 \le cn$ for each $t \in [0, n^K]$. Then $C$ holds a.a.s. by Lemma 2.5. Also, by Theorem 1.1 there are constants $n_0$ and $\eta > 0$ such that the following holds. Let $n \ge n_0$ and consider the system with $n$ queues. Let $x$ be a queue-lengths vector such that $\|x\|_1 \le cn$ and $\|x\|_\infty \le k - 1$. Then, given $X_0 = x$,


dio (£(X,), WI) <e7 
for all times ¢ - ti = n7! In^ n. In particular, for n sufficiently large by (10) 
Pr(M, xk—1|Xo—x) < g M gcn 
Thus, since the system is in equilibrium, for each i = 0, 1, ... 
Pr(Mitpa £k — lái) ze" 4 en 


on the event D; = {|| Xin ll; < cn) N (Mis < k — 1}. Hence if we denote [n^ /1 | 


by io, 
ig 
n( () p. 
i=0 


< Pr(Do) H Pe( Dis N pi) 
i=0 j=0 

< (e 7^ p e7 ny 

< exp(—(n* /1)n- (6 **»0») 

= exp(—ne to) 


— 0 as n — co. 


Pr({Z <k—1}NC) 


lA 


Since C holds a.a.s., it follows that Z > k a.a.s., as required. 




4. Concentration. In this section we prove concentration of measure results for the queue-lengths process $(X_t)$. Let $n$ be a positive integer, and let $\Omega$ be the corresponding set of queue-lengths vectors, that is, the set of nonnegative vectors in $\mathbb{Z}^n$. Let us say that a real-valued function $f$ on $\Omega$ is Lipschitz (with constant 1) if

$$|f(x) - f(y)| \le \|x - y\|_1$$

for all $x, y \in \Omega$. Let $d$ be a fixed positive integer. The key result is the following lemma.


LEMMA 4.1. There is a constant $c > 0$ such that the following holds. Let $n \ge 2$ be an integer and consider the $n$-queue system. Let the queue-lengths vector $Y$ have the equilibrium distribution. Let $f$ be a Lipschitz function on $\Omega$. Then for each $u \ge 0$

$$\Pr(|f(Y) - E[f(Y)]| \ge u) \le n e^{-cu/n^{1/2}}.$$


Recall that $\ell(k, x)$ denotes $|\{j : x(j) \ge k\}|$, the number of queues of length at least $k$; and observe that for any fixed $k$ this is Lipschitz as a function of $x$. We deduce from the last lemma the following result concerning the random variables $\ell(k, Y)$.


LEMMA 4.2. Consider the $n$-queue system, and let the queue-lengths vector $Y$ have the equilibrium distribution. For each nonnegative integer $k$ let $\ell(k) = E[\ell(k, Y)]$. Then for any constant $c > 0$,

$$\Pr\Bigl(\sup_k |\ell(k, Y) - \ell(k)| \ge c n^{1/2}\ln^2 n\Bigr) = e^{-\Omega(\ln^2 n)}.$$

Also, for each integer $r \ge 2$

$$\sup_k \bigl|E[\ell(k, Y)^r] - \ell(k)^r\bigr| = O(n^{r-1}\ln^2 n).$$

PROOF. We argue as in the proof of Lemma 5.2 of [7]. For the first part, let $c_1 > \lambda/(1-\lambda)$, and note that by Lemma 2.5

$$\Pr(\ell(\lceil c_1 n \rceil, Y) > 0) = e^{-\Omega(n)}.$$

It follows that we may restrict attention to queue lengths $k \le c_1 n$. For, since always $\ell(k, Y) \le n$ we have $\ell(\lceil c_1 n \rceil) < 1$ for $n$ sufficiently large; and then

$$\Pr\Bigl(\sup_{k \ge c_1 n} |\ell(k, Y) - \ell(k)| \ge 1\Bigr) \le \Pr(\ell(\lceil c_1 n \rceil, Y) \ge 1) = e^{-\Omega(n)}.$$

Now the first part of the lemma follows easily from Lemma 4.1.




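For the second part, fix an integer $r \ge 2$. By Lemma 4.1 there is a constant $c_0 > 0$ such that, if we set $u = c_0 n^{1/2}\ln n$, then

$$\sup_k \Pr(|\ell(k, Y) - \ell(k)| > u) = o(n^{-r}).$$

Hence, for each positive integer $s \le r$,
$$E[|\ell(k, Y) - \ell(k)|^s] \le u^s + n^s \Pr(|\ell(k, Y) - \ell(k)| > u) = u^s + o(1),$$
uniformly over $k$. The result now follows from
$$\begin{aligned}
0 &\le E[\ell(k, Y)^r] - \ell(k)^r\\
&= \sum_{s=2}^{r} \binom{r}{s} E[(\ell(k, Y) - \ell(k))^s]\, \ell(k)^{r-s}\\
&\le \sum_{s=2}^{r} \binom{r}{s} E[|\ell(k, Y) - \ell(k)|^s]\, n^{r-s}\\
&= O(n^{r-1}\ln^2 n),
\end{aligned}$$
uniformly over $k$. □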


Our proof of Lemma 4.1 will follow the lines of the proof of Lemma 5.1 in [7]. 
The task is somewhat easier here, and we obtain tighter bounds, since for each 
fixed n the departures process has a bounded rate. Departures occur as events in a 
Poisson process at rate n, each one from an independently selected uniformly ran- 
dom queue, except that departures from empty queues are ignored. As in [7], along 
the way we prove concentration for Lipschitz functions of the time-dependent 
process for “nice” initial conditions—see Lemma 4.3 below. 
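
The dynamics just described (arrivals at rate $\lambda n$, potential departures at rate $n$ from a uniformly chosen queue, with departures from empty queues ignored) can be sketched as a small event-driven simulation. This is only an illustrative sketch: the function name, the return values, sampling the $d$ choices with replacement, and breaking ties in favour of the first sampled shortest queue are assumptions made for the example, not details taken from the text above.

```python
import random


def simulate_supermarket(n, lam, d, t_end, seed=0):
    """Simulate n queues up to time t_end, starting from all queues empty.

    Events form a Poisson process of total rate (lam + 1) * n: each event
    is an arrival with probability lam / (lam + 1), otherwise a potential
    departure from a uniformly random queue ("virtual" if it is empty).
    """
    rng = random.Random(seed)
    queues = [0] * n
    t = 0.0
    arrivals = real_departures = virtual_departures = 0
    total_rate = (lam + 1.0) * n
    while True:
        t += rng.expovariate(total_rate)
        if t > t_end:
            break
        if rng.random() < lam / (lam + 1.0):
            # Arrival: sample d queues and join a shortest one among them.
            choices = [rng.randrange(n) for _ in range(d)]
            j = min(choices, key=lambda q: queues[q])
            queues[j] += 1
            arrivals += 1
        else:
            # Potential departure from a uniformly chosen queue.
            j = rng.randrange(n)
            if queues[j] > 0:
                queues[j] -= 1
                real_departures += 1
            else:
                virtual_departures += 1
    return queues, arrivals, real_departures, virtual_departures


if __name__ == "__main__":
    q, a, r, v = simulate_supermarket(n=1000, lam=0.7, d=2, t_end=20.0)
    print("max queue length:", max(q), "nonempty fraction:", sum(x > 0 for x in q) / len(q))
```

Counting virtual departures separately mirrors the role of $\tilde Z_t$ in the proof, where the total number of potential departures is Poisson with mean $nt$ regardless of the state.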

An overview of the proof is as follows. Consider a queue-lengths process $(X_t)$ where $X_0 = 0$. For $t > 0$, let $Z_t$ be the total number of arrivals in $[0, t]$, and let $\tilde Z_t$ be the total number of departures in $[0, t]$ (including "virtual ones," i.e., departures from empty queues). Thus $Z_t \sim \mathrm{Po}(\lambda nt)$ and $\tilde Z_t \sim \mathrm{Po}(nt)$. Let $\mu_t = E[f(X_t)]$, and $\mu_t(z, \tilde z) = E[f(X_t) \mid Z_t = z, \tilde Z_t = \tilde z]$. We use earlier coupling results and the bounded differences method to upper bound $\Pr(|f(X_t) - \mu_t(z, \tilde z)| \ge u \mid Z_t = z, \tilde Z_t = \tilde z)$.

Next we remove the conditioning on $Z_t$ and $\tilde Z_t$. To do this, we choose suitable "widths" $w$ and $\tilde w$, and use the fact that $\Pr(|Z_t - \lambda nt| > w)$ and $\Pr(|\tilde Z_t - nt| > \tilde w)$ are small, and for $z, \tilde z$ such that $|z - \lambda nt| \le w$, $|\tilde z - nt| \le \tilde w$ the difference $|\mu_t(z, \tilde z) - \mu_t|$ is at most about $4(w + \tilde w)$. We thus find that $\Pr(|f(X_t) - \mu_t| > 5(w + \tilde w))$ is small. The part of the proof up to here is encapsulated in Lemma 4.3 below. Finally we use the mixing results, Theorem 1.1 and Lemma 2.1, to relate the distribution of $X_t$ to the equilibrium distribution.

Let us start on the details of the proof. In this section we shall use the following 
lemma with xo — 0; we consider more general initial states for later use. 




LEMMA 4.3. There is a constant $c > 0$ such that the following holds. Let $n \ge 2$ be an integer and let $f$ be a Lipschitz function on $\Omega$. Let also $x_0 \in \Omega$ and assume that the queue-lengths process $(X_t)$ satisfies $X_0 = x_0$ a.s. Then for all times $t > 0$ and all $u \ge 0$,

$$(12)\qquad \Pr(|f(X_t) - \mu_t| \ge u) \le n e^{-cu^2/(nt+u)}.$$


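PROOF. Note first that we may assume without loss of generality that $f(x_0) = 0$, and so $|f(X_t)| \le Z_t + \tilde Z_t$, since we could replace $f(x)$ by its translation $\tilde f(x) = f(x) - f(x_0)$. Let $z$ and $\tilde z$ be positive integers, and condition on $Z_t = z$ and $\tilde Z_t = \tilde z$. Now $f(X_t)$ depends on $2(z + \tilde z)$ independent random variables $T_1, \ldots, T_z$, $D_1, \ldots, D_z$, $\tilde T_1, \ldots, \tilde T_{\tilde z}$, $\tilde D_1, \ldots, \tilde D_{\tilde z}$ which specify the arrival time and corresponding choice of $d$ queues for each of $z$ customers, and the $\tilde z$ departure times and corresponding selections of a queue during $[0, t]$. This property relies on the well-known fact that conditional on the number of events of a Poisson process during $[0, t]$, the arrival times are a sample of i.i.d. random variables uniform on $[0, t]$. Let $\mathbf{T} = (T_1, \ldots, T_z)$, $\mathbf{D} = (D_1, \ldots, D_z)$, $\tilde{\mathbf{T}} = (\tilde T_1, \ldots, \tilde T_{\tilde z})$ and $\tilde{\mathbf{D}} = (\tilde D_1, \ldots, \tilde D_{\tilde z})$. We may write $f(X_t)$ as $g(\mathbf{T}, \mathbf{D}, \tilde{\mathbf{T}}, \tilde{\mathbf{D}})$ where

$$g(\mathbf{t}, \mathbf{d}, \tilde{\mathbf{t}}, \tilde{\mathbf{d}}) = f\bigl(s_t(x_0, \mathbf{t}, \mathbf{d}, \tilde{\mathbf{t}}, \tilde{\mathbf{d}})\bigr),$$

in the notation of Lemma 2.3.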

We now prove that, conditional on $Z_t = z$ and $\tilde Z_t = \tilde z$, the random variable $f(X_t)$ is strongly concentrated, by showing that $g(\mathbf{t}, \mathbf{d}, \tilde{\mathbf{t}}, \tilde{\mathbf{d}})$ satisfies a "bounded differences" condition. Suppose first that we alter a single coordinate value $d_j$ or $\tilde d_j$. Then the value of $g$ can change by at most 2, by Lemma 2.3 starting at time $t_j$ with $\|x_{t_j} - y_{t_j}\|_1 \le 2$. Similarly, if we change a coordinate value $t_j$ or $\tilde t_j$, the value of $g$ can change by at most 2; we may see this by applying Lemma 2.3 once at the earlier time and once at the later time. Now we use the independent bounded differences inequality; see, for instance, [12]. Hence, for each $u > 0$

$$\Pr\bigl(|g(\mathbf{T}, \mathbf{D}, \tilde{\mathbf{T}}, \tilde{\mathbf{D}}) - E[g(\mathbf{T}, \mathbf{D}, \tilde{\mathbf{T}}, \tilde{\mathbf{D}})]| \ge u\bigr) \le 2\exp\Bigl(-\frac{u^2}{4(z + \tilde z)}\Bigr).$$

In other words, we have proved that for any $u > 0$

$$(13)\qquad \Pr\bigl(|f(X_t) - \mu_t(z, \tilde z)| \ge u \mid Z_t = z, \tilde Z_t = \tilde z\bigr) \le 2\exp\Bigl(-\frac{u^2}{4(z + \tilde z)}\Bigr),$$
which is the desired upper bound on the quantity on the left-hand side.
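
To see where the constant in (13) comes from, here is a minimal sketch of the independent bounded differences bound in its usual two-sided form, $2\exp(-2u^2/\sum_i c_i^2)$, specialised to $2(z + \tilde z)$ coordinates each with difference bound $c_i = 2$. The function names are hypothetical and used only for illustration.

```python
import math


def bounded_differences_bound(u: float, diffs: list[float]) -> float:
    """Two-sided bounded differences bound 2*exp(-2u^2 / sum(c_i^2)),
    capped at 1 since it bounds a probability."""
    total = sum(c * c for c in diffs)
    if total == 0.0:
        return 1.0 if u <= 0 else 0.0
    return min(1.0, 2.0 * math.exp(-2.0 * u * u / total))


def conditional_bound(u: float, z: int, z_tilde: int) -> float:
    """Specialisation to 2(z + z_tilde) coordinates with c_i = 2, so that
    sum(c_i^2) = 8(z + z_tilde) and the exponent is u^2 / (4(z + z_tilde))."""
    return bounded_differences_bound(u, [2.0] * (2 * (z + z_tilde)))


if __name__ == "__main__":
    print(conditional_bound(20.0, 30, 40))
```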
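Next we will remove the conditioning on $Z_t$ and $\tilde Z_t$. We will choose suitable "widths" $w = w(n) > 0$ and $\tilde w = \tilde w(n) > 0$, where $0 \le w \le \lambda nt$ and $0 \le \tilde w \le nt$. Let $I$ denote the interval of integer values $z$ such that $|z - \lambda nt| \le w$; let $\tilde I$ denote the interval of integer values $\tilde z$ such that $|\tilde z - nt| \le \tilde w$. Since $Z_t \sim \mathrm{Po}(\lambda nt)$ and $\tilde Z_t \sim \mathrm{Po}(nt)$, by inequalities (4) and (5),

$$(14)\qquad \Pr(Z_t \notin I) = \Pr(|Z_t - \lambda nt| > w) \le 2\exp\Bigl(-\frac{w^2}{3\lambda nt}\Bigr)$$
and
$$(15)\qquad \Pr(\tilde Z_t \notin \tilde I) = \Pr(|\tilde Z_t - nt| > \tilde w) \le 2\exp\Bigl(-\frac{\tilde w^2}{3nt}\Bigr).$$

We shall assume that $w \ge 2(\lambda nt \ln n)^{1/2}$ and $\tilde w \ge 2(nt \ln n)^{1/2}$, and so it follows by (14) and (15) that

$$(16)\qquad E\bigl[(Z_t + \tilde Z_t) 1_{\{Z_t \notin I\} \cup \{\tilde Z_t \notin \tilde I\}}\bigr] = o(1).$$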


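From Lemma 2.3, for each $z$ and $\tilde z$,

$$(17)\qquad |\mu_t(z + 1, \tilde z) - \mu_t(z, \tilde z)| \le 1$$
and
$$(18)\qquad |\mu_t(z, \tilde z + 1) - \mu_t(z, \tilde z)| \le 1.$$

We claim that for each $z \in I$ and $\tilde z \in \tilde I$,
$$(19)\qquad |\mu_t(z, \tilde z) - \mu_t| \le 2(w + \tilde w) + o(1).$$
To prove this, observe that

$$\mu_t = \sum_{z \in I, \tilde z \in \tilde I} \mu_t(z, \tilde z)\Pr(Z_t = z, \tilde Z_t = \tilde z) + E\bigl[f(X_t) 1_{\{Z_t \notin I\} \cup \{\tilde Z_t \notin \tilde I\}}\bigr].$$

Hence by (16), since $|f(X_t)| \le Z_t + \tilde Z_t$,

$$\mu_t \le \max_{z \in I, \tilde z \in \tilde I} \mu_t(z, \tilde z) + E\bigl[(Z_t + \tilde Z_t) 1_{\{Z_t \notin I\} \cup \{\tilde Z_t \notin \tilde I\}}\bigr] \le \max_{z \in I, \tilde z \in \tilde I} \mu_t(z, \tilde z) + o(1),$$

and by (14) and (15),

$$\mu_t \ge \min_{z \in I, \tilde z \in \tilde I} \mu_t(z, \tilde z)\Pr(Z_t \in I, \tilde Z_t \in \tilde I) + o(1) \ge \min_{z \in I, \tilde z \in \tilde I} \mu_t(z, \tilde z) + o(1).$$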

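Now we may use (17) and (18) to complete the proof of (19). By (13), (14), (15) and (19),

$$\begin{aligned}
\Pr\bigl(|f(X_t) - \mu_t| \ge 3(w + \tilde w)\bigr)
&\le \sum_{z \in I, \tilde z \in \tilde I} \Pr\bigl(|f(X_t) - \mu_t| \ge 3(w + \tilde w) \mid Z_t = z, \tilde Z_t = \tilde z\bigr)\Pr(Z_t = z, \tilde Z_t = \tilde z)\\
&\qquad + \Pr(Z_t \notin I) + \Pr(\tilde Z_t \notin \tilde I)\\
&\le \sum_{z \in I, \tilde z \in \tilde I} \Pr\bigl(|f(X_t) - \mu_t(z, \tilde z)| \ge w + \tilde w + o(1) \mid Z_t = z, \tilde Z_t = \tilde z\bigr)\Pr(Z_t = z, \tilde Z_t = \tilde z)\\
&\qquad + \Pr(|Z_t - \lambda nt| > w) + \Pr(|\tilde Z_t - nt| > \tilde w)\\
&\le 2\exp\Bigl(-\frac{(1 + o(1))(w + \tilde w)^2}{4(\lambda nt + nt + w + \tilde w)}\Bigr) + 2\exp\Bigl(-\frac{w^2}{3\lambda nt}\Bigr) + 2\exp\Bigl(-\frac{\tilde w^2}{3nt}\Bigr)\\
&\le 2\exp\Bigl(-\frac{(1 + o(1))(w + \tilde w)^2}{8(\lambda + 1)nt}\Bigr) + 2\exp\Bigl(-\frac{w^2}{3\lambda nt}\Bigr) + 2\exp\Bigl(-\frac{\tilde w^2}{3nt}\Bigr).
\end{aligned}$$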
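Now let $u$ and $t$ satisfy
$$(20)\qquad 12(nt \ln n)^{1/2} \le u \le 6\lambda nt.$$

Let $w = \tilde w = u/6$. Then $u = 3(w + \tilde w)$; and $w$ and $\tilde w$ are as required, that is, $2(\lambda nt \ln n)^{1/2} \le w \le \lambda nt$ and $2(nt \ln n)^{1/2} \le \tilde w \le nt$. Hence for $n$ sufficiently large we have

$$(21)\qquad \Pr(|f(X_t) - \mu_t| \ge u) \le \exp\Bigl(-\frac{u^2}{144nt}\Bigr).$$

But if $u \le 12(nt \ln n)^{1/2}$, then $\exp(-\frac{u^2}{144nt}) \ge n^{-1}$. Thus, as long as $u \le 6\lambda nt$ we have

$$\Pr(|f(X_t) - \mu_t| \ge u) \le n\exp\Bigl(-\frac{u^2}{144nt}\Bigr).$$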

Now let us get rid of the upper bound on $u$. If $6\lambda nt < u \le 6ent$, then by the above of course

$$\Pr(|f(X_t) - \mu_t| \ge u) \le \Pr(|f(X_t) - \mu_t| \ge \lambda u/e) \le n\exp\Bigl(-\frac{\lambda^2 u^2}{144e^2 nt}\Bigr).$$

Finally consider $u \ge 6ent$. We saw that $|f(X_t)| \le Z_t + \tilde Z_t$. Thus $|\mu_t| \le E[|f(X_t)|] \le (1 + \lambda)nt \le 2nt$. Hence, if $u \ge 6ent$, then

$$\begin{aligned}
\Pr(|f(X_t) - \mu_t| \ge u) &\le \Pr(|f(X_t)| \ge 2u/3)\\
&\le \Pr(Z_t \ge u/3) + \Pr(\tilde Z_t \ge u/3)\\
&\le 2\Pr(\tilde Z_t \ge u/3).
\end{aligned}$$

But $\tilde Z_t \sim \mathrm{Po}(nt)$ and $u/3 \ge 2ent$, and so by (6) the last bound is at most $2^{1-u/3}$. The lemma now follows. □


We shall use Lemma 4.3 here with $X_0 = 0$ to complete the proof of Lemma 4.1. As we saw before, we may assume that $f(0) = 0$, and hence always $|f(x)| \le \|x\|_1$. It remains to relate the distribution of $X_t$ with $X_0 = 0$ to the equilibrium distribution. But by Theorem 1.1 there exists a constant $\eta > 0$ such that, for each positive integer $n$ and each time $t \ge 0$, if $Y$ has the equilibrium distribution, then we have $d_{TV}(\mathcal{L}(X_t), \mathcal{L}(Y)) \le ne^{-\eta t} + 2e^{-\eta n}$. Also, by Lemma 4.3 we may assume that $\eta > 0$ is sufficiently small that, for each $n$, each $t > 0$ and each $u \ge 0$

$$\begin{aligned}
\Pr(|f(Y) - \mu_t| \ge u) &\le d_{TV}(\mathcal{L}(X_t), \mathcal{L}(Y)) + \Pr(|f(X_t) - \mu_t| \ge u)\\
&\le ne^{-\eta t} + 2e^{-\eta n} + n\exp\Bigl(-\frac{\eta u^2}{nt + u}\Bigr).
\end{aligned}$$

Further, we may assume that $\eta > 0$ is sufficiently small that also Lemma 2.1 holds with this value of $\eta$.

Now let « = max{1,In7!(1/A)}, let u > 3«g^!nl/? Inn and let t = n7'/2u. 
Then 


2 
u 
ez nt > 3k inn. 
nt 


Thus by the above, 
Pr(|f(Y) — us| > u) x 3ne ™ 4-267, 
But u: = E[ f (X;)], and so by Lemmas 2.1 and 2.4 
|j — EL f(Y)]| < dw(£(X»), L(Y) = o(D, 
since X9 = 0 and so ôn, t = 0, and nt > 3x Inn. Thus we find that 
Pr(|f(Y) — ELf(Y)] > u + 1) < 3nexp( nn! u) + 227" 
< (3n --2)exp( —3n ^u) 


for all n sufficiently large and all 3x9 !n!? Inn <u x n??, 
Let c = zL. Note that ne" ^" > 1 foru < 3n 1 nV? Inn. Hence, since also 
c=, 
Pr(| f (Y) — ELf(Y)]| > u +1) < (3n +2)exp(—cn7!/*u) 


for all $n$ sufficiently large and all $0 \le u \le n^{3/2}$.
We may assume that $0 < c \le 1$. Then we may replace $u + 1$ by $u$ if we replace $c$ by $c/2$; that is,

$$\Pr(|f(Y) - E[f(Y)]| \ge u) \le (3n + 2)\exp\bigl(-(c/2)n^{-1/2}u\bigr)$$

for all $n$ sufficiently large and all $0 \le u \le n^{3/2}$. For if $u < 2$, then the right-hand side above is at least 1, and if $u \ge 2$, then $u \ge u/2 + 1$. Now consider square roots: if we replace $c$ by $c/2$ again, then we may replace the factor $(3n + 2)$ by $(3n + 2)^{1/2}$, which is at most $n$ for $n \ge 4$. Thus, with a new $c$, we have

$$\Pr(|f(Y) - E[f(Y)]| \ge u) \le n\exp(-cn^{-1/2}u)$$

for all $n$ sufficiently large and all $0 \le u \le n^{3/2}$. One further minor adjustment to $c$ lets us assert that the last inequality holds for all integers $n \ge 2$ and all $0 \le u \le n^{3/2}$.
It remains only to consider values of $u > n^{3/2}$. But always $|f(y)| \le \|y\|_1$, and $E[\|Y\|_1] \le \lambda n/(1-\lambda)$. Thus $|f(Y) - E[f(Y)]| \le \|Y\|_1 + \lambda n/(1-\lambda)$. Hence for $u \ge 3\lambda n/(1-\lambda)$,
$$\Pr(|f(Y) - E[f(Y)]| \ge u) \le \Pr(\|Y\|_1 \ge E[\|Y\|_1] + u/3) = e^{-\Omega(u)},$$

from the proof of Lemma 2.4 and a standard large deviations calculation for a sum of independent geometric random variables (see [5], pages 201-202), and so Lemma 4.1 follows.

Lemma 4.3 is a quantitative version of some earlier results in [2—4, 13, 14, 17]. 
It will also be used in [8] for analyzing the asymptotic distribution of the length of 
a given queue and for analyzing the "propagation of chaos." 


5. Balance equations and long-term behavior. In this section we consider the system in equilibrium, and present the key equation (24). We then show that the expected number $\ell(i)$ of queues with at least $i$ customers is close to $n\lambda^{1+d+\cdots+d^{i-1}}$, and that the random number $\ell(i, Y_t)$ of queues with at least $i$ customers stays close to this value over long periods of time. Here $(Y_t)$ is a queue-lengths process in equilibrium. We shall denote $Y_0$ by $Y$ below. As before, $(X_t)$ will denote a queue-lengths process with a "time-dependent" distribution. Observe that $\lambda^{1+d+\cdots+d^{i-1}}$ equals $\lambda^i$ if $d = 1$, and equals $\lambda^{(d^i-1)/(d-1)}$ if $d \ge 2$.
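
The closed form above is the solution of the recursion $v(i) = \lambda v(i-1)^d$ with $v(0) = 1$, which is what equation (24) reduces to if the expectation of the $d$th power is replaced by the $d$th power of the expectation. A minimal sketch checking this numerically follows; the function name `tail_fraction` is hypothetical.

```python
def tail_fraction(lam: float, d: int, i: int) -> float:
    """Return lam ** (1 + d + ... + d**(i-1)), the limiting proportion of
    queues with at least i customers (equal to 1 when i = 0)."""
    if d == 1:
        exponent = i
    else:
        exponent = (d ** i - 1) // (d - 1)  # exact: the geometric sum is an integer
    return lam ** exponent


if __name__ == "__main__":
    lam = 0.9
    for d in (1, 2, 3):
        print(d, [round(tail_fraction(lam, d, i), 6) for i in range(6)])
```

With $d = 1$ the values decay geometrically, while for $d \ge 2$ they decay doubly exponentially in $i$, which is the source of the $\ln\ln n/\ln d$ scale for the maximum queue length.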

Let $d$ be a fixed positive integer. Fix also a positive integer $n$, and consider the corresponding set $\Omega$ of queue-length vectors. For $x \in \Omega$ and a nonnegative integer $k$, let $u(k, x)$ be the proportion of queues $j$ of length $x(j)$ at least $k$. Thus always $u(0, x) = 1$. Let $u(k)$ denote $E[u(k, Y)]$: thus $u(k) = \ell(k)/n$. We also let $u_t(k) = E[u(k, X_t)]$.

If $f$ is the bounded function $f(x) = u(k, x)$, then the generator operator $A$ of the Markov process (see, e.g., [1], Section 1.1, and see also [17]) satisfies

$$Af(x) = \lambda\bigl(u(k - 1, x)^d - u(k, x)^d\bigr) - \bigl(u(k, x) - u(k + 1, x)\bigr).$$

[Compare with (1) earlier.] This is true, since $u(k, x) - u(k + 1, x)$ is the proportion of queues of length exactly $k$, and $u(k - 1, x)^d - u(k, x)^d$ is the probability that the minimum queue length of the $d$ attempts is exactly $k - 1$. From standard theory (see [1], Chapters 1 and 4), for each bounded function $f$, whatever the initial distribution of $X_0$,

$$\frac{dE[f(X_t)]}{dt} = E[Af(X_t)],$$

and so in particular, for each positive integer $k$,
$$(22)\qquad \frac{du_t(k)}{dt} = \lambda\bigl(E[u(k - 1, X_t)^d] - E[u(k, X_t)^d]\bigr) - \bigl(u_t(k) - u_t(k + 1)\bigr).$$
As $Y$ is in equilibrium,

$$(23)\qquad \lambda\bigl(E[u(k - 1, Y)^d] - E[u(k, Y)^d]\bigr) - \bigl(u(k) - u(k + 1)\bigr) = 0.$$


We shall consider only the equilibrium case for the remainder of this section, and rest our analysis on (23). Now

$$\sum_{k \ge 1} u(k, x) = \frac{1}{n}\sum_{j=1}^{n} x(j) = \frac{1}{n}\|x\|_1,$$

and so

$$\sum_{k \ge 1} u(k) = \frac{1}{n}E[\|Y\|_1] < \infty.$$

Hence $u(k) \to 0$ as $k \to \infty$. Also $E[u(k, Y)^d] \le u(k)$. Summing (23) for $k \ge i$ we obtain, for each $i = 1, 2, \ldots$, that

$$(24)\qquad \lambda E[u(i - 1, Y)^d] - u(i) = 0.$$

(This is equivalent to saying that $E[Af(Y)] = 0$, where $f(x)$ is the number of customers with position at least $i$, but since $f$ is not bounded we cannot assert the result directly.) Equation (24) is crucial to our analysis.

Note first that by (24), $u(1) = \lambda$, and for each $i = 2, 3, \ldots$

$$u(i) = \lambda E[u(i - 1, Y)^d] \le \lambda u(i - 1).$$
Thus by induction on $i$
$$(25)\qquad u(i) \le \lambda^i \quad \text{for each } i = 0, 1, 2, \ldots.$$

By (24) and the second part of Lemma 4.2, there exists a constant $c_1 > 0$ such that for each positive integer $n$,

$$(26)\qquad \sup_i |u(i) - \lambda u(i - 1)^d| \le c_1 n^{-1}\ln^2 n.$$

We claim that, for some constant $c_2 > 0$, for each positive integer $n$

$$(27)\qquad \sup_i |u(i) - \lambda^{1+d+\cdots+d^{i-1}}| \le c_2 n^{-1}\ln^2 n;$$

and hence
$$(28)\qquad \sup_i |\ell(i) - n\lambda^{1+d+\cdots+d^{i-1}}| \le c_2 \ln^2 n$$
for all positive integers $n$.
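Let us prove the claim. Note first that for $x, y \ge 0$ we have

$$(29)\qquad |y^d - x^d| = |y - x|(y^{d-1} + y^{d-2}x + \cdots + x^{d-1}) \le d(x \vee y)^{d-1}|y - x|.$$

Now by (25), $u(i) \le \lambda^i$ and $\lambda^{1+d+\cdots+d^{i-1}} \le \lambda^i$, so by (29),

$$(30)\qquad \begin{aligned}
|u(i + 1) - \lambda^{1+d+\cdots+d^{i}}| &\le |u(i + 1) - \lambda u(i)^d| + \lambda\bigl|u(i)^d - (\lambda^{1+d+\cdots+d^{i-1}})^d\bigr|\\
&\le |u(i + 1) - \lambda u(i)^d| + \lambda d \cdot \lambda^{i(d-1)}\bigl|u(i) - \lambda^{1+d+\cdots+d^{i-1}}\bigr|.
\end{aligned}$$

Since $u(0) = 1$, an easy induction on $i$ using (26) and (30) gives

$$|u(i) - \lambda^{1+d+\cdots+d^{i-1}}| \le \Bigl(\sum_{j=0}^{i-1} (\lambda d)^j\Bigr) c_1 n^{-1}\ln^2 n$$

for each $i = 1, 2, \ldots$. Let $i_0$ be the least $i$ such that $\lambda d \cdot \lambda^{i(d-1)} \le 1/2$. Let $c_2 = c_1 \cdot \max\{2, \sum_{j=0}^{i_0-1}(\lambda d)^j\}$. Clearly

$$|u(i) - \lambda^{1+d+\cdots+d^{i-1}}| \le c_2 n^{-1}\ln^2 n$$

for $i = 0, 1, \ldots, i_0$. We shall prove by induction on $i$ that this holds for all $i$. Let $i \ge i_0$ and suppose that the inequality holds for $i$. Then by (26) and (30)

$$|u(i + 1) - \lambda^{1+d+\cdots+d^{i}}| \le c_1 n^{-1}\ln^2 n + \tfrac{1}{2}c_2 n^{-1}\ln^2 n \le c_2 n^{-1}\ln^2 n,$$

as required for the induction step. Thus we have proved the claim (27).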

This completes the first half of our task here. Now let $K > 0$ be an arbitrary constant, and let $\tau = n^K$. We see next that all the coordinates $\ell(i, Y_t)$ are likely to stay close to $\ell(i)$ throughout the interval $[0, \tau]$.


LEMMA 5.1. Let $(Y_t)$ be in equilibrium and let $c > 0$ be a constant. Let $B_\tau$ be the event that for all times $t$ with $0 \le t \le \tau$

$$\sup_i |\ell(i, Y_t) - n\lambda^{1+d+\cdots+d^{i-1}}| \le cn^{1/2}\ln^2 n.$$

Then $\Pr(\bar B_\tau) \le e^{-\Omega(\ln^2 n)}$.




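PROOF. By the first part of Lemma 4.2, there exists $\gamma > 0$ such that for all $n$ sufficiently large, for each time $t \ge 0$

$$\Pr\Bigl(\sup_i |\ell(i, Y_t) - \ell(i)| \ge cn^{1/2}\ln^2 n/4\Bigr) \le e^{-\gamma\ln^2 n}.$$

Let $s = c\ln^2 n/(8\lambda n^{1/2})$, let $j = \lceil \tau/s \rceil$ and consider times $rs$ for $r = 0, \ldots, j$. The mean number of arrivals in a subinterval $[(r - 1)s, rs)$ is $cn^{1/2}\ln^2 n/8$, so by (5) the probability that more than $cn^{1/2}\ln^2 n/4$ arrivals occur is at most $e^{-cn^{1/2}\ln^2 n/24} \le e^{-\gamma\ln^2 n}$ for $n$ sufficiently large. Then

$$\begin{aligned}
&\Pr\Bigl(\sup_i |\ell(i, Y_t) - \ell(i)| > cn^{1/2}\ln^2 n/2 \text{ for some } t \in [0, \tau]\Bigr)\\
&\qquad \le \sum_{r=0}^{j} \Pr\Bigl(\sup_i |\ell(i, Y_{rs}) - \ell(i)| \ge cn^{1/2}\ln^2 n/4\Bigr)\\
&\qquad\qquad + \sum_{r=1}^{j} \Pr\bigl([(r - 1)s, rs) \text{ has } > cn^{1/2}\ln^2 n/4 \text{ arrivals}\bigr)\\
&\qquad \le (\tau/s + 2) \cdot 2e^{-\gamma\ln^2 n}\\
&\qquad = e^{-\Omega(\ln^2 n)}
\end{aligned}$$
for $n$ sufficiently large. Now we can use (28). □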


6. Lower bound on mixing times. In this short section we show that the 
upper bounds on the mixing times in Theorem 1.1 are of the right order, and in 
particular we prove (2). We do this by considering the total number of nonempty 
queues in the system. The idea is that this number is highly concentrated around its 
mean. In equilibrium the mean is An. On the other hand, if Xo = 0 and t < 01nn, 
then the expected number of nonempty queues is less than (A — 0)n for a suitable 
constant 0. Thus for such ¢ the two distributions are far apart. 

Now for the details. Consider two $n$-queue processes $(X_t)$ and $(Y_t)$, where $X_0 = 0$ and $(Y_t)$ is in equilibrium. By Lemma 2.3, we can couple $(X_t)$ and $(Y_t)$ in such a way that always $X_t \le Y_t$ and so $u(i, X_t) \le u(i, Y_t)$ for each $i$. In particular, if as before we let $u_t(i) = E[u(i, X_t)]$, then $u_t(i) \le u(i)$. (Recall that $u(i) = E[u(i, Y_t)] = \frac{1}{n}E[\ell(i, Y_t)]$.) Let $s_t(i) = u(i) - u_t(i)$, so $s_t(i) \ge 0$. Also from (22)

$$\frac{du_t(1)}{dt} = \lambda\bigl(1 - E[u(1, X_t)^d]\bigr) - \bigl(u_t(1) - u_t(2)\bigr)$$

and

$$0 = \lambda\bigl(1 - E[u(1, Y_t)^d]\bigr) - \bigl(u(1) - u(2)\bigr),$$

so that
$$\frac{ds_t(1)}{dt} = -s_t(1) - \lambda\bigl(E[u(1, Y_t)^d] - E[u(1, X_t)^d]\bigr) + s_t(2).$$

But, as in (29), given $0 \le y \le x \le 1$, we have

$$x^d - y^d = (x - y)(x^{d-1} + \cdots + y^{d-1}) \le d(x - y).$$
Hence, since we may assume that always $u(1, X_t) \le u(1, Y_t)$, we have
$$E[u(1, Y_t)^d - u(1, X_t)^d] \le d\bigl(u(1) - u_t(1)\bigr) = ds_t(1).$$
It follows [using also the fact that $s_t(2) \ge 0$] that for all $t \ge 0$,

$$\frac{ds_t(1)}{dt} \ge -(\lambda d + 1)s_t(1).$$
Thus
$$s_t(1) \ge \lambda e^{-(\lambda d + 1)t}$$
for all times $t \ge 0$, since $s_0(1) = u(1) = \lambda$. Thus if $t \le \frac{1}{2(\lambda d + 1)}\ln n - \frac{2}{\lambda d + 1}\ln\ln n$, then $s_t(1) \ge \lambda n^{-1/2}\ln^2 n$, that is,
$$\lambda n - E[\ell(1, X_t)] \ge \lambda n^{1/2}\ln^2 n.$$
But if $t$ is $O(\ln n)$, then from (21) in the proof of Lemma 4.3,
$$\Pr\bigl(|\ell(1, X_t) - E[\ell(1, X_t)]| \ge \tfrac{1}{2}\lambda n^{1/2}\ln^2 n\bigr) = e^{-\Omega(\ln^3 n)}.$$
Also, by the first part of Lemma 4.2,
$$\Pr\bigl(|\ell(1, Y_t) - \lambda n| \ge \tfrac{1}{2}\lambda n^{1/2}\ln^2 n\bigr) = e^{-\Omega(\ln^2 n)}.$$
Inequality (2) now follows.
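
For the reader's convenience, the passage from the differential inequality for $s_t(1)$ to the exponential lower bound, and then to the threshold on $t$, can be written out as follows. This is a routine integrating-factor argument, spelled out here only as an illustration of the step used above.

```latex
\begin{align*}
\frac{d}{dt}\Bigl(e^{(\lambda d+1)t}\, s_t(1)\Bigr)
  &= e^{(\lambda d+1)t}\Bigl(\frac{ds_t(1)}{dt} + (\lambda d+1)\, s_t(1)\Bigr) \;\ge\; 0,\\
e^{(\lambda d+1)t}\, s_t(1) &\;\ge\; s_0(1) = \lambda
  \qquad\Longrightarrow\qquad
  s_t(1) \;\ge\; \lambda e^{-(\lambda d+1)t},\\
(\lambda d+1)\,t \;\le\; \tfrac12 \ln n - 2\ln\ln n
  &\qquad\Longrightarrow\qquad
  e^{-(\lambda d+1)t} \;\ge\; n^{-1/2}\ln^2 n .
\end{align*}
```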


7. Proof of Theorem 1.3. We assume throughout that the process is in equilibrium. For each time $t \ge 0$ and each $i = 0, 1, \ldots$ let the random variable $Z_t(i)$ be the number of new customers arriving during $[0, t]$ with position at least $i$ on arrival. Let $J_0 = 0$ and enumerate the arrival times after time 0 as $J_1, J_2, \ldots$. We define a "horizon" time $t_0 = \ln^2 n$, and let $N = \lceil 2\lambda nt_0 \rceil$. Note that $\ell(1, Y_t)$ is the number of nonempty queues at time $t$. For each time $t$, let $A_t$ be the event

$$\{\lambda n/2 \le \ell(1, Y_s) \le 2\lambda n \ \ \forall s \in [0, t]\}.$$

Then by Lemma 5.1, the event $A_{t_0}$ holds a.a.s.

We need two more preliminary results. The first one is an analogue of a special 
case of Lemma 2.1 in [7]. It may be proved quickly along the lines of the proof of 
that lemma; for completeness we give a proof here. 




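LEMMA 7.1. Let $(X_t)$ be in equilibrium. Let $s, \tau > 0$ and let $a, b$ be nonnegative integers. Let $\delta = n(\lambda ds)^{b+1}/(b + 1)!$. Then

$$(31)\qquad \Pr[M_t \le a \text{ for some } t \in [0, \tau]] \le \Bigl(\frac{\tau}{s} + 1\Bigr)\bigl(\Pr(M_0 \le a + b) + \delta\bigr)$$

and

$$(32)\qquad \Pr[M_t \ge a + b \text{ for some } t \in [0, \tau]] \le \Bigl(\frac{\tau}{s} + 1\Bigr)\bigl(\Pr(M_0 \ge a) + \delta\bigr).$$


PROOF. The $j = \lfloor \tau/s \rfloor + 1$ disjoint intervals $[(r - 1)s, rs)$ for $r = 1, \ldots, j$ cover $[0, \tau]$. Let $C_r$ denote the event of having at least $b + 1$ arrivals in the interval $[(r - 1)s, rs)$ which are placed into a single bin. Then

$$\Pr(C_r) \le n\Pr(\mathrm{Po}(\lambda ds) \ge b + 1) \le \delta.$$
But

$$\{M_t \le a \text{ for some } t \in [0, \tau]\} \subseteq \Bigl(\bigcup_{r=1}^{j} \{M_{rs} \le a + b\}\Bigr) \cup \Bigl(\bigcup_{r=1}^{j} C_r\Bigr)$$

and (31) follows. Similarly

$$\{M_t \ge a + b \text{ for some } t \in [0, \tau]\} \subseteq \Bigl(\bigcup_{r=0}^{j-1} \{M_{rs} \ge a\}\Bigr) \cup \Bigl(\bigcup_{r=1}^{j} C_r\Bigr)$$

and (32) follows. □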


We shall use the second lemma to bound "initial" effects. For each time t > 0 
let $; be the event that some initial customer survives at least to time f. 


LEMMA 7.2. Let 0 < X « 1 and let d be a positive integer. Let a = 
min(i In( 1), a}, so a > Q. Then for each positive integer n and each time t > Q, 


Pr(S;) x 2ne *'. 


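PROOF. Let $k = \lfloor t/4 \rfloor$. By (25)
$$\Pr(M_0 \ge k + 1) \le n\lambda^{k+1} \le n\lambda^{t/4}.$$

Let $Z$ be distributed like the sum of $k$ independent service times. Thus $Z$ has a gamma distribution, and has moment generating function $E[e^{sZ}] = (1 - s)^{-k}$ for $s < 1$. So by Markov's inequality, for each $0 < s < 1$,

$$\begin{aligned}
\Pr(Z \ge t) &\le e^{-st}(1 - s)^{-k}\\
&= e^{-t/2}\, 2^{k} \qquad \text{taking } s = \tfrac{1}{2}\\
&\le \Bigl(\frac{e^2}{2}\Bigr)^{-t/4}.
\end{aligned}$$

Hence

$$\Pr(S_t) \le \Pr(M_0 \ge k + 1) + n\Pr(Z \ge t) \le n\lambda^{t/4} + n\Bigl(\frac{e^2}{2}\Bigr)^{-t/4} \le 2ne^{-\alpha t}. \qquad □$$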


Let $i^* = i^*(n)$ be the smallest integer $i$ such that $\lambda^{(d^i-1)/(d-1)} < n^{-1/2}\ln^2 n$ (see the discussion following Theorem 1.3). Note that $i^* = \ln\ln n/\ln d + O(1)$.
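
The definition of $i^*$ is easy to evaluate numerically. The following minimal sketch (the function name `i_star` is hypothetical) searches for the smallest $i$ satisfying the defining inequality for $d \ge 2$, so that it can be compared with $\ln\ln n/\ln d$.

```python
import math


def i_star(n: int, lam: float, d: int) -> int:
    """Smallest integer i >= 0 with lam**((d**i - 1)/(d - 1)) < n**-0.5 * (ln n)**2.

    Assumes n >= 2, 0 < lam < 1 and d >= 2, so the search terminates.
    """
    if n < 2 or not (0.0 < lam < 1.0) or d < 2:
        raise ValueError("requires n >= 2, 0 < lam < 1 and d >= 2")
    threshold = n ** -0.5 * math.log(n) ** 2
    i = 0
    while True:
        exponent = (d ** i - 1) // (d - 1)  # exact integer geometric sum
        if lam ** exponent < threshold:
            return i
        i += 1


if __name__ == "__main__":
    for n in (10 ** 6, 10 ** 9, 10 ** 12):
        print(n, i_star(n, 0.9, 2), math.log(math.log(n)) / math.log(2))
```

Because of the $O(1)$ term, the two printed quantities need not be close for moderate $n$; the point of the comparison is only the very slow growth in $n$.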
We can handle the lower bound on $M_t$ easily. By Lemma 5.1, $\Pr(M < i^* - 1) = e^{-\Omega(\ln^2 n)}$. In particular $M \ge i^* - 1$ a.a.s. Further, (31), with $\tau = n^K$, $a = i^* - 3$, $b = 1$ and $s = n^{-K-2}$, shows that

$$(33)\qquad \min\{M_t : 0 \le t \le n^K\} \ge i^* - 2 \quad \text{a.a.s.},$$
since
$$\delta = n\Pr(\mathrm{Po}(\lambda ds) \ge 2) \le n(\lambda ds)^2 = O(n^{-2K-3}).$$

This result establishes the lower bound half of (3).
The upper bounds on $M_t$ are less straightforward to prove. We consider first the easier case when $d \ge 3$.


7.1. The case $d \ge 3$. We shall show that $M \le i^*$ a.a.s. For $k = 0, 1, \ldots$, let $E_k$ be the event that $\ell(i^*, Y_{J_k}) \le 2n^{1/2}\ln^2 n$; that is, at time (just after) $J_k$ there are no more than $2n^{1/2}\ln^2 n$ queues with at least $i^*$ customers. Then $\Pr(\bar E_k) = e^{-\Omega(\ln^2 n)}$ by Lemma 5.1. Consider the customer who arrives at time $J_k$: on $E_{k-1}$, he has probability at most $p_1 = (2n^{-1/2}\ln^2 n)^d$ of joining a queue of length at least $i^*$. Note that

$$\Pr(J_{N+1} \le t_0) \le \Pr[\mathrm{Po}(\lambda nt_0) \ge 2\lambda nt_0] = e^{-\Omega(n)}.$$
Also, for each positive integer $r$,
$$\Pr(B(N, p_1) \ge r) \le (Np_1)^r = O\bigl((n^{-(d-2)/2}(\ln n)^{2d+2})^r\bigr).$$

Hence, for each positive integer $r$,

$$\begin{aligned}
\Pr[Z_{t_0}(i^* + 1) \ge r] &\le \Pr(B(N, p_1) \ge r) + \Pr\Bigl(\bigcup_{k=0}^{N-1} \bar E_k\Bigr) + \Pr(J_{N+1} \le t_0)\\
&= O\bigl((n^{-(d-2)/2}(\ln n)^{2d+2})^r\bigr).
\end{aligned}$$

Also, by Lemma 7.2 there exists a constant $\alpha > 0$ such that the probability that some "initial" customer has not departed by time $t_0$ is at most $2ne^{-\alpha t_0}$. Hence, for each positive integer $r$,

$$\Pr[M \ge i^* + r] \le \Pr[Z_{t_0}(i^* + 1) \ge r] + 2ne^{-\alpha t_0}.$$

In particular, $\Pr(M \ge i^* + 1) = o(1)$, which together with the earlier result that $M \ge i^* - 1$ a.a.s. completes the proof that $M$ is concentrated on at most the two values $i^*$ and $i^* - 1$. Also,

(34) Pr[M > i* -2K 4-5] —o(n 5^. 


Now (32), with t — n*,a —i* -2K +5, b = [K/2] + 1 and s — n ^, lets us 
complete the proof of (3), since 


§ — nPr(Po(Ads) > b + 1) x n(Ads) E? = Q (nf), 


7.2. The case $d = 2$. The case $d = 2$ needs a little more effort, and uses the "drift" results from Section 7 in [7]. For convenience we restate these results here, as the next two lemmas. The first lemma concerns hitting times for a generalized random walk with "drift."


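LEMMA 7.3. Let $\phi_0 \subseteq \phi_1 \subseteq \cdots \subseteq \phi_m$ be a filtration, and let $Y_1, Y_2, \ldots, Y_m$ be random variables taking values in $\{-1, 0, 1\}$ such that each $Y_i$ is $\phi_i$-measurable. Let $E_0, E_1, \ldots, E_{m-1}$ be events where $E_i \in \phi_i$ for each $i$, and let $E = \bigcap_i E_i$. Let $R_t = R_0 + \sum_{i=1}^{t} Y_i$. Let $0 < p \le 1/3$, let $r_0$ and $r_1$ be integers such that $r_0 < r_1$ and let $pm \ge 2(r_1 - r_0)$. Assume that for each $i = 1, \ldots, m$,

$$\Pr(Y_i = 1 \mid \phi_{i-1}) \ge 2p \quad \text{on } E_{i-1} \cap \{R_{i-1} < r_1\}$$
and
$$\Pr(Y_i = -1 \mid \phi_{i-1}) \le p \quad \text{on } E_{i-1} \cap \{R_{i-1} < r_1\}.$$
Then

$$\Pr\bigl(E \cap \{R_t < r_1 \ \forall t \in \{1, \ldots, m\}\} \mid R_0 = r_0\bigr) \le \exp\Bigl(-\frac{pm}{28}\Bigr).$$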


The second lemma shows that if we try to cross an interval against the drift, then 
we will rarely succeed. 


LEMMA 7.4. Let $a$ be a positive integer. Let $p$ and $q$ be reals with $q > p \ge 0$ and $p + q \le 1$. Let $\phi_0 \subseteq \phi_1 \subseteq \phi_2 \subseteq \cdots$ be a filtration, and let $Y_1, Y_2, \ldots$ be random variables taking values in $\{-1, 0, 1\}$ such that each $Y_i$ is $\phi_i$-measurable. Let $E_0, E_1, \ldots$ be events where each $E_i \in \phi_i$, and let $E = \bigcap_i E_i$. Let $R_0 = 0$ and let $R_k = \sum_{i=1}^{k} Y_i$ for $k = 1, 2, \ldots$. Assume that for each $i = 1, 2, \ldots$,

$$\Pr(Y_i = 1 \mid \phi_{i-1}) \le p \quad \text{on } E_{i-1} \cap \{0 \le R_{i-1} \le a - 1\}$$
and
$$\Pr(Y_i = -1 \mid \phi_{i-1}) \ge q \quad \text{on } E_{i-1} \cap \{0 \le R_{i-1} \le a - 1\}.$$

Let

$$T = \inf\{k \ge 1 : R_k \in \{-1, a\}\}.$$
Then

$$\Pr(E \cap \{R_T = a\}) \le (p/q)^a.$$
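
In the simplest special case, where the steps are i.i.d. with $\Pr(Y_i = 1) = p$ and $\Pr(Y_i = -1) = q$ exactly and the events $E_i$ are trivial, the probability in Lemma 7.4 is given by the classical gambler's ruin formula, and the bound $(p/q)^a$ can be checked directly. The sketch below does this; the function name is hypothetical and the i.i.d. setting is an assumption made only for the illustration, not the generality of the lemma.

```python
def hit_top_probability(p: float, q: float, a: int) -> float:
    """Exact Pr(R_T = a) for an i.i.d. walk from 0 with steps +1 (prob p),
    -1 (prob q) and 0 otherwise, stopped on first reaching -1 or a.

    Requires q > p >= 0 and a >= 1. Zero steps only delay the walk, so the
    answer depends on p and q through the ratio rho = q / p alone.
    """
    if not (q > p >= 0.0) or a < 1:
        raise ValueError("requires q > p >= 0 and a >= 1")
    if p == 0.0:
        return 0.0
    rho = q / p
    return (rho - 1.0) / (rho ** (a + 1) - 1.0)


if __name__ == "__main__":
    p, q = 0.2, 0.5
    for a in (1, 2, 5, 10):
        print(a, hit_top_probability(p, q, a), (p / q) ** a)
```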


Having stated the two "drift" lemmas, let us resume the proof for the case d — 2. 
Still let z = Inn and N = [2Antg |. We first show that M > i*, by showing that in 
fact 


(35) £(i*, Ya) 1n — aas. 

Let Jj = 0, and enumerate all jump times after time 0 (not just the arrival times) 
as Jj, J5, . ... For k =0,1,... let Ex be the event that £(i* — 1, Yy) > 3n? n? n; 
and let E, be the event that £(i*, Y z) £ 2ln°n. By Lemma 5.1 as before, 
Pr(Ex) = e 0 7, Let V, = £(*, Y jr) — £G*, Yy __) fork = 1,2, ..., so that 


k 
£(i*, Yy) = £(i*, Yo) + p» Vj. 
j=l 


Let p? =In*n/(25n).On Aj N Ex-1 N Ej, 
Pr(Vi = 1l y; .) = 2p 


and 
Pr(V, —lióy: .) < p2, 


for n sufficiently large. 
Also then np2 > 4ln?n for n sufficiently large. Hence by Lemma 7.3, a.a.s. 
£(i*, Y) > 21n? n for some i <n. Also J’ < tọ a.a.s., since 


Pr(J; > to) < Pr(J, > to) = Pr(Po(Anto) < n) = g Qon n) 


Thus a.a.s. £(i*, Y;)) > 2 In? n for some t c [0, to]. 

It now suffices to show that a.a.s. there will be no “excursions” that cross down- 
ward from [2 In? n] to at most in? n during the interval [0, c]. Let B be the event 
that there is such a crossing. The only possible start times for such a crossing are 
departure times during [0, fo], and a.a.s. there are at most N such times. We may 
use Lemma 7.4 to upper bound the probability that any given excursion leads to a 
crossing. Let a = fin’ n]. Let p = pz and let q = 2p2. We apply the Jemma with Y; 
replaced by — V; and with a, p and q as above, and with the events Ej = A Ji n Eg. 
We obtain 

N-1__ 
Pr(B) < N27? + »( U és) 4- o(1) — o(1). 
k=0 




Thus we have proved that M > i* a.a.s. 

We now consider upper bounds on M. We shall show that M <i* + 1 a.a.s., by 
showing that £(i* + 2, Yy) = 0 a.a.s. For k — 0,1,..., let Fy be the event that at 
the arrival time J; there are no more than 2n!/? In^ n queues of length at least i*. 
By Lemma 5.1 we have Pr(Fy) = e- 90), Consider the customer arriving at 
time Jg: on Fy_1 he has probability at most p3 = 41n^n/n of joining a queue of 
length at least i*. Thus for each positive integer r, 

N-—1 
Pr(Z;, (Gi*+1)> r) < Pr(B(N, pa)x r) + er J Fi) + Pr(Jy41 < to). 
k=0 
Also, by Lemma 7.2 the probability that some “initial” customer has not departed 
by time f9/2 is at most 2ne~“0/* for a suitable constant a > 0. Hence, there is a 
constant Č such that with probability 1 — e72 0) we have £(i* +1, Y;) <éIn°n 
uniformly for all t € [19/2, to]. Thus this also holds over [0, fo]. 

For k = 0,1,..., let F; be the event that £(i* + 1, Y} ) < E1n® n; that 1s, 
at time J; there are no more than činn queues with at least i* + 1 cus- 
tomers. On F, ,, the customer arriving at time J, has probability at most 
p4 = C (Inn)?n-? of joining a queue of length at least i* + 1. Then for each 
positive integer r, 

N-1 . 
Pr(Z,)(i* +2) >r) < Pr(BQN, pa) 2 r) + ( J Ri) + Pr(Jy44 < to). 
k=0 
Also, by Lemma 7.2, the probability that some initial customer stays until time fo 
is at most 2ne~”. It follows that a.a.s. Mọ <i* + 1, and 


(36) Pr(M, > i* +K 4-5) = O(n F^). 


Now the inequality (32) in Lemma 7.1 with t = n® a=zi*+K+5,b= [K/2] -1 
and s = n? yields the upper bound part of (3), since 


8 — nPr(Po(Ads) >b+1)< n(Ads)"*! = O(n-K ^), 


Finally, let us note one result which will be useful in [8]. Let the integer d — 2 
be fixed. Then, as in (36), if r = O (lan), 


(37) Pr(M > i*+14r) =e Q€Inm 


8. Concluding remarks. We have investigated the well-known “supermarket”
model with n servers and a fixed number d of random choices. We have
shown that the system converges rapidly to its equilibrium distribution. Our main
result is that, for $d \ge 2$, in equilibrium the maximum length of a queue is a.a.s.
concentrated on two adjacent values close to $\ln\ln n/\ln d$. In contrast, when $d = 1$ the
maximum length of a queue is a.a.s. close to $\ln n/\ln\ln n$. Along the way we used
the fact (27) that, in equilibrium, for each nonnegative integer k the proportion of
queues of length at least k is close to $\lambda^{1+d+\cdots+d^{k-1}}$.

In [8] we use this last result together with mixing and concentration estimates
(and upper bounds on the maximum length of a queue) obtained here to prove
quantitative results on the convergence of the distribution of a queue length. In
particular, we give a quantitative version of the convergence result mentioned in the
Introduction. Let $(v_t(k) : k \in \mathbb{N})$ be as in (1) above. It turns out that, for suitable
initial conditions, uniformly over all $t \ge 0$ the distribution of a given queue length
at time $t$ is close in total variation distance to the distribution of an integer-valued
nonnegative random variable $V_t$ such that $\Pr(V_t \ge k) = v_t(k)$. Also in [8] we
investigate the asymptotic independence of small subsets of queues, the “chaotic
behavior” of the system.
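The doubly exponential decay of the equilibrium tail is what drives the $\ln\ln n/\ln d$ scale for $d \ge 2$. The following sketch tabulates the heuristic tail $\lambda^{1+d+\cdots+d^{k-1}}$ and finds the first length k at which fewer than one of n queues is expected to be that long. It is only a back-of-envelope illustration: the function names are hypothetical, and the "expected count below one" criterion is a stand-in for the paper's actual a.a.s. statements.

```python
def tail_fraction(lam, d, k):
    """Heuristic proportion of queues of length at least k.

    Returns lam ** (1 + d + ... + d**(k-1)); the empty sum gives 1 for k == 0.
    """
    exponent = sum(d ** i for i in range(k))
    return lam ** exponent


def first_rare_length(lam, d, n):
    """Smallest k for which n * tail_fraction(lam, d, k) drops below 1."""
    k = 0
    while n * tail_fraction(lam, d, k) >= 1:
        k += 1
    return k


if __name__ == "__main__":
    lam = 0.5
    for n in (10 ** 3, 10 ** 6, 10 ** 9):
        print(n, [first_rare_length(lam, d, n) for d in (2, 3, 4)])
```

Increasing n by several orders of magnitude moves the threshold for fixed $d \ge 2$ by only a step or two, which is the qualitative behavior described above.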


REFERENCES 


[1] ETHIER, S. N. and KURTZ, T. G. (1986). Markov Processes: Characterization and Conver- 
gence. Wiley, New York. MR0838085 
[2] GRAHAM, C. (2000). Kinetic limits for large communication networks. In Modelling in
Applied Sciences (N. Bellomo and M. Pulvirenti, eds.) 317-370. Birkhäuser, Boston.
MR1763158
[3] GRAHAM, C. (2000). Chaoticity on path space for a queueing network with selection of the 
shortest queue among several. J. Appl. Probab. 37 198-201. MR1761670 
[4] GRAHAM, C. (2004). Functional central limit theorems for a large network in which cus- 
tomers join the shortest of several queues. Probab. Theory Related Fields 131 97—120. 
MR2105045 
[5] GRIMMETT, G. R. and STIRZAKER, D. R. (2001). Probability and Random Processes, 3rd ed. 
Oxford Univ. Press. MR2059709 
[6] LUCZAK, M. J. (2003). A quantitative law of large numbers via exponential martingales.
In Stochastic Inequalities and Applications (E. Giné, C. Houdré and D. Nualart, eds.)
93-111. Birkhäuser, Basel. MR2073429
[7] LUCZAK, M. J. and MCDIARMID, C. (2005). On the power of two choices: Balls and bins in
continuous time. Ann. Appl. Probab. 15 1733-1764. MR2152243
[8] LUCZAK, M. J. and MCDIARMID, C. (2005). Asymptotic distributions and chaos for the
supermarket model. Unpublished manuscript.
[9] LUCZAK, M. J. and MCDIARMID, C. (2005). Balls and bins in continuous time: Long-term 
asymptotics and chaos. Unpublished manuscript. 
[10] LUCZAK, M. J. and NORRIS, J. R. (2004). Strong approximation for the supermarket model. 
Ann. Appl. Probab. 15 2038—2061. MR2152252 
[11] MARTIN, J. B. and SUHOV, Y. M. (1999). Fast Jackson networks. Ann. Appl. Probab. 9
854-870. MR1722285
[12] MCDIARMID, C. (1998). Concentration. In Probabilistic Methods for Algorithmic Discrete 
Mathematics (M. Habib, C. McDiarmid, J. Ramirez and B. Reed, eds.) 195-248. Springer, 
Berlin. MR1678578 
[13] MITZENMACHER, M. (1996). Load balancing and density dependent jump Markov processes. 
In Proc. 37th Ann. Symp. Found. Comp. Sci. 213-222. IEEE Comput. Soc. Press, Los 
Alamitos, CA. MR1450619 
[14] MITZENMACHER, M. (1996). The power of two choices in randomized load-balancing. 
Ph.D. dissertation, Berkeley. 




[15] MITZENMACHER, M., RICHA, A. W. and SITARAMAN, R. (2001). The power of two random 
choices: A survey of techniques and results. In Handbook of Randomized Computing 
(S. Rajasekaran, P. M. Pardalos, J. H. Reif and J. D. P. Rolim, eds.) 1 255—312. Kluwer, 
Dordrecht. MR1966907 

[16] TURNER, S. R. E. (1998). The effect of increasing routing choice on resource pooling. Probab.
Engrg. Inform. Sci. 12 109-124. MR1492143
[17] VVEDENSKAYA, N. D., DOBRUSHIN, R. L. and KARPELEVICH, F. I. (1996). Queueing
system with selection of the shortest of two queues: An asymptotic approach. Problems
Inform. Transmission 32 15-27. MR1384927


DEPARTMENT OF MATHEMATICS
LONDON SCHOOL OF ECONOMICS
HOUGHTON STREET
LONDON WC2A 2AE
UNITED KINGDOM
E-MAIL: m.j.luczak@lse.ac.uk

DEPARTMENT OF STATISTICS
UNIVERSITY OF OXFORD
1 SOUTH PARKS ROAD
OXFORD OX1 3TG
UNITED KINGDOM
E-MAIL: cmcd@stats.ox.ac.uk


The Annals of Probability
2006, Vol. 34, No. 2, 528-538
DOI: 10.1214/009117905000000729
© Institute of Mathematical Statistics, 2006


THE SIZE OF COMPONENTS IN CONTINUUM 
NEAREST-NEIGHBOR GRAPHS 


BY IVA KOZAKOVA, RONALD MEESTER AND SEEMA NANDA


Vrije Universiteit Amsterdam, Vrije Universiteit Amsterdam 
and University of Tennessee 


We study the size of connected components of random nearest-neighbor 
graphs with vertex set the points of a homogeneous Poisson point process 
in $\mathbb{R}^d$. The connectivity function is shown to decay superexponentially, and
we identify the exact exponent. From this we also obtain the decay rate of 
the maximal number of points of a path through the origin. We define the 
generation number of a point in a component and establish its asymptotic 
distribution as the dimension d tends to infinity. 


1. Basic definitions and results. Let X be a homogeneous density 1 Poisson
process in $\mathbb{R}^d$ with an “extra” Poisson point located at the origin. Let $\vec{G}_d$ denote
the directed graph whose vertices are the Poisson points and in which there is
a directed edge from $s \in X$ to $s' \in X$ if $s'$ is the nearest neighbor (NN) of $s$.
Ignoring the directions of the edges leads to an undirected graph which we denote
by just $G_d$.

The graph $G_d$ was introduced and studied in [1], where it was shown that $G_d$ contains
a.s. no unbounded component, in any dimension d. In the current paper, we are
interested in the tail behavior of components in $G_d$; rather than only stating that
they are finite, we would like to know how large these clusters typically are. To
this end, we make a few definitions.

We denote by $p_d(n, L)$ the probability that there is a directed path in $G_d$ starting
at the origin, touching exactly n distinct points (besides the origin) and ending at a
point s with $|s| \ge L$ (where $|\cdot|$ denotes Euclidean distance). Furthermore, $\tau_d(L)$
is the probability that there is a path in $G_d$ starting at the origin and ending at a
point s with $|s| \ge L$.

Since in a Poisson process the distances between any two pairs of points are
different a.s., each point has an a.s. unique NN. It is quite possible that two Poisson
points are each other’s NN. In this case there is a mini-loop of two directed edges
in $G_d$ between these two NN's. In fact, since each Poisson point must have an NN,
and all components are finite, each component contains at least one such closed
mini-loop. This would happen when a Poisson point s is an NN of one or more
other Poisson points, and when one of the latter is an NN of s as well. Furthermore,
each component contains exactly one closed mini-loop, as the existence of more
than one such closed mini-loop would imply the existence of more than one NN
for some Poisson point in the component.

Received September 2004; revised April 2005.
AMS 2000 subject classifications. 60K35, 60G55, 60D05.
Key words and phrases. Continuum percolation, Poisson process, size of components,
nearest-neighbor connections, random graphs.

It is not possible to have any type of closed circuit other than the one described
here, as the existence of any closed circuit involving more than two Poisson points
would contradict the fact that the lengths of successive directed edges are
decreasing. The existence of an (a.s.) unique NN implies that along any path in $G_d$ there is
at most one change of direction (in the arrows of $G_d$), and the change of direction
must take place at one of the two Poisson points in a loop.

We finally mention that it is a standard fact that there is a uniform and finite 
upper bound, depending on the dimension d, for the number of points that have 
the same point as their nearest neighbor. The maximum number of such points is 
called the kissing number and denoted by $K_d < \infty$; see, for instance, [3].

There are at least two ways to measure the size of a component in $G_d$: one can
look at the diameter of the component, or at the number of points in the component. 
First we state a result that tells us that the diameter of a component decays faster 
than exponentially, and which specifies the exponent exactly. For a related result 
in a discrete setting, see [2]. 


THEOREM 1.1. There exist constants $C_1, C_2, L_0 \in (0, \infty)$ (depending only
on the dimension d) such that

(1) \[ e^{-C_1 L (\log L)^{(d-1)/d}} \le \tau_d(L) \le e^{-C_2 L (\log L)^{(d-1)/d}} \quad \text{for } L \ge L_0. \]


Turning to the number of points in a component, we let $p_d(n)$ denote the
probability that there is a path in $G_d$ through the origin touching more than n distinct
points.


THEOREM 1.2. There exist constants $C_3, C_4, N_0 \in (0, \infty)$ (depending only
on d) such that

\[ e^{-C_3 n \log n} \le p_d(n) \le e^{-C_4 n \log n} \quad \text{for } n \ge N_0. \]


For fixed dimension, it seems difficult to make more precise statements about
the number of points in a component. Therefore, we investigate what happens in
the limit when the dimension $d \to \infty$, and we will obtain some indirect
information about the size of a component via the so-called generation number of a point.
We already remarked that any component contains exactly one mini-loop of two
arrows. The points in this mini-loop are given generation number 1. A Poisson
point x receives generation number k if the graph distance to the unique mini-loop
in the component of x is equal to $k - 1$. We denote by $g_d(k)$ the probability that
the origin has generation number k, in dimension d. The following result is an
indication that components typically are very small.




THEOREM 1.3. We have that

\[ \lim_{d \to \infty} g_d(k) = \frac{k}{(k+1)!}, \quad k = 1, 2, \ldots. \]


Section 2 contains the proofs of Theorems 1.1 and 1.2, while Section 3 contains 
the proof of Theorem 1.3. 


2. Proofs of Theorems 1.1 and 1.2. In this section $W_1, \ldots, W_n$ are i.i.d.
$\mathbb{R}^d$-valued random variables whose common probability density is given by
$e^{-V(B(0,|w|))} = e^{-\pi_d |w|^d}$. The lower bound is based on the following. Note that
$\tau_d(L) \ge p_d(n, L)$ for every n.


PROPOSITION 2.1. There exist constants $b_1, c_1 \in (0, \infty)$ (depending only
on d) such that

(2) \[ p_d(n, L) \ge \frac{b_1^n}{n!} e^{-c_1 n (L/n)^d}. \]

As will be seen from the proof of the proposition, one may choose any
$\theta \in (0, \pi/4]$ and take $b_1$ to be the probability that a uniformly distributed random
point on the unit sphere $S^{d-1}$ falls into the “polar cap” of opening half-angle $\theta$.
The corresponding $c_1$ may then be taken as $\pi_d/(\cos\theta)^d$, where $\pi_d = V(B(0, 1))$, where
V denotes Euclidean volume and B(x, r) denotes the (open) ball of radius r
centered at $x \in \mathbb{R}^d$.

For the upper bound we need to bound not only $p_d(n, L)$ but also some closely
related quantities that we now define.

For $j \in \{0, \ldots, n\}$ we define $p_d(n, L, j)$ to be the probability of the event
$E(n, L, j)$, that there are two directed paths in $G_d$: one from 0 to some $s'$, touching
exactly j particles (besides 0) and one from some s to the same $s'$, touching exactly
$n - j$ particles (besides $s'$) and such that $|s| \ge L$. Thus $p_d(n, L) = p_d(n, L, n)$
and furthermore (by the properties of the directed graph mentioned above and)
since $\tau_d(L)$ is equal to the probability of $\bigcup_{n=1}^{\infty} \bigcup_{j=0}^{n} E(n, L, j)$, we have

(3) \[ \tau_d(L) \le \sum_{n=1}^{\infty} \sum_{j=0}^{n} p_d(n, L, j). \]

The upper bound is based on the following:


PROPOSITION 2.2. There exists a constant $c_2 \in (0, \infty)$ (depending only on d)
such that

(4) \[ p_d(n, L, j) \le \frac{(K_d)^n}{j!(n-j)!} P(|W_1 + \cdots + W_n| \ge c_2 L) \le \frac{(K_d)^n}{j!(n-j)!} P(|W_1| + \cdots + |W_n| \ge c_2 L), \]

where $W_1, \ldots, W_n$ are i.i.d. $\mathbb{R}^d$-valued random variables as described earlier.

As a corollary to Proposition 2.2 and inequality (3), we have the following.

PROPOSITION 2.3. $\tau_d(L) \le e^{2K_d} P(U \ge c_2 L)$, where U is a random variable
with $E(e^{rU}) = e^{2K_d (E(e^{r|W_1|}) - 1)}$.


PROOF OF PROPOSITION 2.1. Given $(x_1, \ldots, x_n) \in (\mathbb{R}^d)^n$ we define $s_0 = 0$
and $s_i = x_1 + \cdots + x_i$ for $i = 1, \ldots, n$. Let the set of points $(x_1, \ldots, x_n) \in (\mathbb{R}^d)^n$
satisfying the following three conditions be denoted by $\mathcal{S}$:

(i) $|x_1| > |x_2| > \cdots > |x_n|$,
(ii) $|s_n| \ge L$,
(iii) $s_i \notin \bigcup_{j=1}^{i} B_j$ for $i = 1, \ldots, n$, where $B_j = B(s_{j-1}, |x_j|)$ is the open ball
centered at $s_{j-1}$ of radius $|x_j|$.

Note that [because of (i)] condition (iii) may be replaced by:
(iii') $B_l \cap \{s_0, \ldots, s_n\} = s_{l-1}$ for $l = 1, \ldots, n$.

We now claim that

(5) \[ p_d(n, L) = \int_{\mathcal{S}} e^{-V(\bigcup_{j=1}^{n} B_j)} \, dx_1 \cdots dx_n. \]


Since we use this type of equality (a variation on Campbell's theorem) a
number of times, and since we have not been able to find a proof in the
literature, we spend a few lines on the proof of (5). Our proof proceeds by a
suitable discretization of $\mathbb{R}^d$. For $k = 1, 2, \ldots$, we consider (nested) subdivisions of
$\mathbb{R}^d$ into d-dimensional cubes of side length $2^{-k}$. We denote by $D^k(\mathcal{S})$ the
collection of n-tuples $(D_1^k, \ldots, D_n^k)$ of these cubes with the property that for all
$s_1 \in D_1^k, \ldots, s_n \in D_n^k$ we have $(s_1, \ldots, s_n) \in \mathcal{S}$.

Denoting the event in question by E, we denote by $E_k$ the event that in addition,
all points $s_1, \ldots, s_n$ of the directed path are the only Poisson points in their
respective cubes $D_1^k, \ldots, D_n^k$ of side length $2^{-k}$ and such that $(D_1^k, \ldots, D_n^k) \in D^k(\mathcal{S})$. It is
clear that $E_k \subset E$ and that $P(E_k) \to P(E)$, as $k \to \infty$. We therefore need to
compute $\lim_{k \to \infty} P(E_k)$. Denoting the integrand in (5) by f, we write $f(D_1^k, \ldots, D_n^k)$
for the expectation of $f(X_1, \ldots, X_n)$ where the $X_i$'s are independent and uniform
over $D_i^k$, respectively, $i = 1, \ldots, n$.

Note that $E_k$ occurs when there is an n-tuple $(D_1^k, \ldots, D_n^k) \in D^k(\mathcal{S})$ such that
each box $D_i^k$ of side length $2^{-k}$ in this sequence contains exactly one Poisson
point, such that in addition, the correct balls around these points contain no further
Poisson point. This gives

\[
\begin{aligned}
P(E_k) &= \sum_{(D_1^k, \ldots, D_n^k) \in D^k(\mathcal{S})} \bigl(2^{-kd} + o(2^{-kd})\bigr)^n f(D_1^k, \ldots, D_n^k) \\
&= 2^{-kdn} \sum_{(D_1^k, \ldots, D_n^k) \in D^k(\mathcal{S})} f(D_1^k, \ldots, D_n^k)\,(1 + o(1)) \\
&\to \int_{\mathcal{S}} f(x_1, \ldots, x_n) \, dx_1 \cdots dx_n,
\end{aligned}
\]

as $k \to \infty$, proving (5).
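Now let $Q(\theta)$ be the polar cap of half angle $\theta \le \pi/4$ in the unit sphere of $\mathbb{R}^d$ with
vertex at the origin; that is,

\[ Q(\theta) = \{x = (x_1, \ldots, x_d) \in \mathbb{R}^d : |x| = 1 \text{ and } x_1 \ge \cos\theta\}. \]

Since

(6) \[ V\Bigl(\bigcup_{j=1}^{n} B_j\Bigr) \le \sum_{j=1}^{n} V(B_j), \]

it follows [using the notation of Proposition 2.2, letting $S_i = W_1 + \cdots + W_i$ and
$B_j = B(S_{j-1}, |W_j|)$] that

\[
\begin{aligned}
p_d(n, L) &\ge \int_{\mathcal{S}} \prod_{j=1}^{n} e^{-\pi_d |x_j|^d} \, dx_1 \cdots dx_n \\
&= P\Bigl(|W_1| > \cdots > |W_n|,\ |S_n| \ge L \text{ and } S_i \notin \bigcup_{j} B_j \text{ for each } i\Bigr) \\
&\ge P\Bigl(|W_1| > \cdots > |W_n|,\ \sum_{j=1}^{n} |W_j| \cos\theta \ge L,\ W_j/|W_j| \in Q(\theta) \text{ for each } j\Bigr) \\
&= b(\theta)^n \, P\Bigl(|W_1| > \cdots > |W_n|,\ \sum_{j=1}^{n} |W_j| \cos\theta \ge L\Bigr) \\
&\ge b(\theta)^n \, P(|W_1| > \cdots > |W_n|) \Bigl[P\Bigl(|W_1| \ge \frac{L}{n \cos\theta}\Bigr)\Bigr]^n \\
&= \frac{b(\theta)^n}{n!} \, e^{-\pi_d (\cos\theta)^{-d} n (L/n)^d}.
\end{aligned}
\]

Here $b(\theta)$ represents the probability that $S_i$ lies in a cone of half angle $\theta \le \pi/4$
with vertex at $S_{i-1}$. The result now follows from the approximation
$b(\theta)^n/n! \approx e^{n(\log(b(\theta)) - \log n + 1)}$. □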




To prove Proposition 2.2 we need Lemma 2.4 below which should be compared
to (6). As a replacement for (5), one can straightforwardly obtain the inequality

(7) \[ p_d(n, L, j) \le \int_{\mathcal{S}_j} e^{-V(\bigcup_{i=1}^{j} B_i \,\cup\, \bigcup_{i=j+1}^{n} B_i')} \, dx_1 \cdots dx_n. \]

Here $B_i' = B(s_i, |x_i|)$ and $\mathcal{S}_j$ is the set of points $(x_1, \ldots, x_n) \in (\mathbb{R}^d)^n$ satisfying
the three conditions (where again $s_0 = 0$ and $s_i = x_1 + \cdots + x_i$):

(i) $|x_1| > \cdots > |x_j|$ and $|x_{j+1}| \le \cdots \le |x_n|$,
(ii) $|s_n| \ge L$,
(iii) $B_l \cap \{s_0, \ldots, s_n\} = s_{l-1}$ for $l = 1, \ldots, j$ and $B_i' \cap \{s_0, \ldots, s_n\} = s_i$ for
$i = j+1, \ldots, n$.

LEMMA 2.4. Let $(x_1, \ldots, x_n) \in \mathcal{S}_j$. Then

(8) \[ V\Bigl(\bigcup_{i=1}^{j} B_i \cup \bigcup_{i=j+1}^{n} B_i'\Bigr) \ge \frac{1}{K_d} \Bigl(\sum_{i=1}^{j} V(B_i) + \sum_{i=j+1}^{n} V(B_i')\Bigr). \]

PROOF. This is a consequence of the fact that no point x can lie in the
intersection of more than $K_d$ of the balls $B_i$. To see this, note that if this were not so,
then x would be the closest point (among $\{x, q_1, \ldots, q_n\}$) to more than $K_d$ of the
$q_i$'s, contradicting the definition of the kissing number $K_d$. □


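PROOF OF PROPOSITION 2.2. From now on we call $K_d$ merely K. Starting
with (7), we apply Lemma 2.4 to find

\[
\begin{aligned}
p_d(n, L, j) &\le \int_{\mathcal{S}_j} e^{-(1/K) \sum_{i=1}^{n} \pi_d |x_i|^d} \, dx_1 \cdots dx_n \\
&= \int_{\mathcal{S}_j} \prod_{i=1}^{n} e^{-(1/K) \pi_d |x_i|^d} \, dx_1 \cdots dx_n.
\end{aligned}
\]

By the change of variable $x_i = y_i K^{1/d}$ for each i, the last integral becomes

(9) \[ K^n \int_{\mathcal{S}_j'} \prod_{i=1}^{n} e^{-\pi_d |y_i|^d} \, dy_1 \cdots dy_n, \]

where $\mathcal{S}_j'$ is the set of $(y_1, \ldots, y_n) \in (\mathbb{R}^d)^n$ such that $(K^{1/d} y_1, \ldots, K^{1/d} y_n) \in \mathcal{S}_j$.
To get an upper bound on (9) we simply drop the third condition in the definition
of $\mathcal{S}_j$; that is, we replace $\mathcal{S}_j'$ with $\tilde{\mathcal{S}}_j$, the set of $(y_1, \ldots, y_n)$ satisfying only the two
conditions:

(i) $|y_1| > \cdots > |y_j|$ and $|y_{j+1}| \le \cdots \le |y_n|$,
(ii) $|\sum_{i=1}^{n} K^{1/d} y_i| \ge L$.

Since the event in (ii) is symmetric in the $y_i$, the ordering in (i) contributes a
factor $1/(j!(n-j)!)$. This yields the bound

\[ p_d(n, L, j) \le \frac{K^n}{j!(n-j)!} P(|W_1 + \cdots + W_n| \ge c_2 L) \]

with $c_2 = K^{-1/d}$; the second inequality in (4) follows from the triangle inequality. □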


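PROOF OF PROPOSITION 2.3. From (3) and (4) we have

(10) \[
\begin{aligned}
\tau_d(L) &\le \sum_{n=1}^{\infty} \sum_{j=0}^{n} \frac{K^n}{j!(n-j)!} P(|W_1| + \cdots + |W_n| \ge c_2 L) \\
&= \sum_{n=1}^{\infty} \frac{(2K)^n}{n!} P(|W_1| + \cdots + |W_n| \ge c_2 L) \\
&= e^{2K} \sum_{n=1}^{\infty} \frac{e^{-2K} (2K)^n}{n!} P(|W_1| + \cdots + |W_n| \ge c_2 L) \\
&= e^{2K} P(U \ge c_2 L).
\end{aligned}
\]

The last equality follows by taking U to be a compound Poisson random variable
$|W_1| + \cdots + |W_N|$ (where N is independent of $|W_1|, |W_2|, \ldots$ and is Poisson with
mean 2K). By a standard calculation,

(11) \[ E(e^{rU}) = e^{2K (E(e^{r|W_1|}) - 1)}. \]

This completes the proof of Proposition 2.3. □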
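The step from the double sum to the Poisson weights in (10) rests on the identity $\sum_{j=0}^{n} K^n/(j!(n-j)!) = (2K)^n/n!$, which is the binomial theorem in disguise. A minimal exact check with rational arithmetic, assuming an integer value of K purely for illustration (the function names are hypothetical):

```python
from fractions import Fraction
from math import factorial


def inner_sum(n, K):
    """Sum over j of K^n / (j! (n-j)!), computed exactly."""
    return sum(
        Fraction(K ** n, factorial(j) * factorial(n - j)) for j in range(n + 1)
    )


def poisson_form(n, K):
    """The closed form (2K)^n / n! that appears in the second line of (10)."""
    return Fraction((2 * K) ** n, factorial(n))


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, inner_sum(n, 3) == poisson_form(n, 3))
```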


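PROOF OF THEOREM 1.1. Taking $n \approx L/(\log L)^{1/d}$ in (2) we get the lower bound
in (1).

For the upper bound we use large deviation bounds on $P(U \ge c_2 L)$:

(12) \[
\begin{aligned}
P(U \ge c_2 L) &\le \exp\Bigl[\inf_{r > 0} \bigl(\log(E(e^{rU})) - r c_2 L\bigr)\Bigr] \\
&= \exp\Bigl[\inf_{r > 0} \bigl(2K (E(e^{r|W_1|}) - 1) - r c_2 L\bigr)\Bigr] \\
&\le \exp\Bigl[\inf_{r > 0} \bigl(2K e^{c_3 r^{d/(d-1)}} - r c_2 L\bigr)\Bigr].
\end{aligned}
\]

The last inequality follows from the fact that $E(e^{r|W_1|}) \le e^{c_3 r^{d/(d-1)}}$ for large r
(as can easily be shown). Taking $r \approx \alpha (\log L)^{(d-1)/d}$ for an appropriate constant
$\alpha$ we get $\tau_d(L) \le e^{-C_2 L (\log L)^{(d-1)/d}}$ for large L. □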
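As a sanity check on the exponent $(d-1)/d$, the two optimizations in this proof can be written out. This is only a sketch under stated assumptions: that (2) has the form $\frac{b_1^n}{n!} e^{-c_1 n (L/n)^d}$, and that $E(e^{r|W_1|}) \le e^{c r^{d/(d-1)}}$ for large r. The constant name $c$ and the scaling parameter $\alpha$ are placeholders introduced here, not necessarily the authors' notation.

```latex
% Lower bound: put n = \alpha L (\log L)^{-1/d}, so that \log n = (1+o(1)) \log L.
% By Stirling, \log n! = n \log n - n + O(\log n).
\begin{align*}
\log\frac{b_1^{\,n}}{n!} &= -n\log n + O(n)
   = -\alpha\, L (\log L)^{(d-1)/d}\,(1+o(1)),\\
c_1\, n \Bigl(\frac{L}{n}\Bigr)^{d} &= c_1\, \alpha^{1-d}\, L (\log L)^{(d-1)/d}.
\end{align*}
% Both terms are of order L (\log L)^{(d-1)/d}, which gives the lower bound in (1).

% Upper bound: put r = \alpha (\log L)^{(d-1)/d}, so that r^{d/(d-1)} = \alpha^{d/(d-1)} \log L.
\begin{align*}
e^{c\, r^{d/(d-1)}} &= L^{\,c\,\alpha^{d/(d-1)}},\\
2K e^{c\, r^{d/(d-1)}} - r c_2 L
   &= 2K L^{\,c\,\alpha^{d/(d-1)}} - \alpha\, c_2\, L (\log L)^{(d-1)/d}.
\end{align*}
% If \alpha is small enough that c\,\alpha^{d/(d-1)} < 1, the first term is o(L)
% and the exponent is -(1+o(1))\,\alpha c_2\, L (\log L)^{(d-1)/d}.
```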




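PROOF OF THEOREM 1.2. The lower bound follows from Proposition 2.1
by taking L = 0. For the upper bound, we note that in order for a path with the
required properties to exist, there has to be a path from the origin, touching at least
$\lfloor n/2 \rfloor$ points with at most one change in direction. Using Proposition 2.2, this leads
to

\[
\begin{aligned}
p_d(n) &\le \sum_{j=0}^{\lfloor n/2 \rfloor} p_d(\lfloor n/2 \rfloor, 0, j) \\
&\le \sum_{j=0}^{\lfloor n/2 \rfloor} \frac{(K_d)^{\lfloor n/2 \rfloor}}{j!(\lfloor n/2 \rfloor - j)!} \\
&= \frac{(2K_d)^{\lfloor n/2 \rfloor}}{\lfloor n/2 \rfloor!} \\
&\approx e^{(n/2)\log(2K_d) - (n/2)(\log n - \log 2 - 1)} \\
&\le e^{-C_4 n \log n}
\end{aligned}
\]

for large n, proving the result. □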


3. Proof of Theorem 1.3. We shall need the following simple fact. Define

\[ L_d(a, b, y) = V(B(s, a) \cap B(t, b)), \]

where s and t are two points in $\mathbb{R}^d$ at distance y.

PROPOSITION 3.1. For each fixed $y \ge 1$ we have

\[ \frac{L_d(1, y, y)}{\pi_d} \to 0, \]

as $d \to \infty$.

PROOF. Note that $L_d(1, y, y)$ is the volume of a lens contained in a cylinder of radius
$r = r(y) < 1$ and height 1. The volume of this cylinder is $\pi_{d-1} r^{d-1} = o(\pi_d)$. □
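The decay in Proposition 3.1 can be seen numerically from the cylinder bound. The sketch below assumes the cylinder has a $(d-1)$-dimensional base of radius $r(y) = \sqrt{1 - 1/(4y^2)}$, which is the radius of the sphere where the two ball boundaries meet; this explicit value of $r(y)$ is worked out here and is not stated in the text. It uses the standard formula $\pi_d = \pi^{d/2}/\Gamma(d/2 + 1)$ for the volume of the unit ball; the function names are hypothetical.

```python
import math


def log_unit_ball_volume(d):
    """Logarithm of the volume of the unit ball in R^d."""
    return (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)


def cylinder_ratio(d, y):
    """Ratio pi_{d-1} * r^(d-1) / pi_d with r = sqrt(1 - 1/(4 y^2)), y >= 1."""
    r = math.sqrt(1 - 1 / (4 * y * y))
    log_ratio = (
        log_unit_ball_volume(d - 1)
        + (d - 1) * math.log(r)
        - log_unit_ball_volume(d)
    )
    return math.exp(log_ratio)


if __name__ == "__main__":
    for d in (2, 10, 50, 200, 1000):
        print(d, cylinder_ratio(d, 1.0), cylinder_ratio(d, 3.0))
```

The ratio $\pi_{d-1}/\pi_d$ grows only like $\sqrt{d}$, while $r^{d-1}$ decays geometrically, so the product tends to zero; the decay is slower for large y because $r(y)$ is then close to 1.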


PROOF OF THEOREM 1.3. For notational convenience, we write $B_{i,j}$ for
$B(s_i, |x_j|)$, where as before $x_j = s_j - s_{j-1}$, for $j = 1, 2, \ldots$.

The set of points $(s_1, \ldots, s_k) \in (\mathbb{R}^d)^k$ satisfying $|x_1| > |x_2| > \cdots > |x_k|$ is
denoted by $U_k$. Furthermore, the subset of $U_k$ which in addition satisfies

\[ s_i \notin \bigcup_{m=1}^{i-1} B_{m-1,m} \]

for $i = 2, \ldots, j$ is denoted by $U_k^j$, for $j = 2, \ldots, k$. For convenience we define
$U_k^1 = U_k$. As in Section 2 we may now write

\[
g_d(k) = \int_{s = (s_1, \ldots, s_k) \in U_k^k} e^{-V(B_{0,1})} e^{-V(B_{1,2} \setminus B_{0,1})} e^{-V(B_{2,3} \setminus (B_{0,1} \cup B_{1,2}))}
\times \cdots \times e^{-V((B_{k-1,k} \cup B_{k,k}) \setminus \bigcup_{i=1}^{k-1} B_{i-1,i})} \, ds.
\]

We now first estimate this from below, and after that we show that the error
we make by doing this tends to zero when the dimension tends to infinity. The
first step is to replace the volumes in the exponents by the volume we would get
without subtracting anything from the first set mentioned in each exponent. Thus

(13) \[ g_d(k) \ge \int_{s = (s_1, \ldots, s_k) \in U_k^k} e^{-\pi_d |x_1|^d} e^{-\pi_d |x_2|^d} \cdots e^{-\pi_d |x_{k-1}|^d} e^{-2\pi_d |x_k|^d} \, ds. \]

Writing the integrand in this formula as W(s), the next step is to rewrite this as

(14) \[ \int_{U_k} W(s) \, ds - \sum_{j=2}^{k} \int_{U_k^{j-1} \setminus U_k^j} W(s) \, ds, \]

that is, we have an integral over $U_k$ as our leading term, and subtract from this the
integral over those sequences s which have a first index j for which $s_j$ falls into a
previous ball, $j = 2, \ldots, k$.

To compute the leading term, note that a simple change of variables gives that

(15) \[ \int_{U_k} W(s) \, ds = \int_{0 < y_k < y_{k-1} < \cdots < y_1} e^{-y_1} \cdots e^{-y_{k-1}} e^{-2y_k} \, dy_1 \cdots dy_k. \]

This integral can be computed explicitly, but its value is most easily found and
understood via a simple probabilistic interpretation. Indeed, the integral is equal to
1/2 times the probability that $Y_1 > Y_2 > \cdots > Y_k$, where the $Y_i$'s are independent,
where $Y_1, Y_2, \ldots, Y_{k-1}$ are standard exponentially distributed and where $Y_k$ has an
exponential distribution with parameter 2. The probability that $Y_1, Y_2, \ldots, Y_{k-1}$
are ordered this way is just $1/(k-1)!$. Since $Y_k$ has the same distribution as
the minimum of two independent exponentially distributed random variables, the
probability that one of these two will be the smallest among the $k + 1$ random
variables in question is simply $2/(k+1)$. It follows that the integral is equal to

\[ \frac{1}{2} \cdot \frac{1}{(k-1)!} \cdot \frac{2}{k+1} = \frac{k}{(k+1)!}. \]
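The value of the leading term and the fact that these values sum to 1 over k can both be checked with exact rational arithmetic. The sketch below multiplies out the three factors from the probabilistic argument and uses the telescoping identity $k/(k+1)! = 1/k! - 1/(k+1)!$ for the partial sums; the function names are hypothetical.

```python
from fractions import Fraction
from math import factorial


def limit_mass(k):
    """(1/2) * 1/(k-1)! * 2/(k+1), the value of the leading integral."""
    return Fraction(1, 2) * Fraction(1, factorial(k - 1)) * Fraction(2, k + 1)


def partial_sum(n):
    """Sum of limit_mass(k) over k = 1, ..., n."""
    return sum(limit_mass(k) for k in range(1, n + 1))


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, limit_mass(k), float(limit_mass(k)))
    # Telescoping: the partial sums equal 1 - 1/(n+1)!, so they increase to 1.
    print(partial_sum(10), 1 - Fraction(1, factorial(11)))
```

Under the limiting distribution the origin has generation number 1 with probability 1/2 and generation number at most 3 with probability 23/24, which is one way to read the remark that components are typically very small.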


Next we will show that the remaining terms in (14) tend to 0 as the dimension
d tends to infinity. For this we need to bound

\[ \int_{U_k^{j-1} \setminus U_k^j} W(s) \, ds, \]

for $j = 2, \ldots, k$. For given j, this integral is over those sequences for which j is
the first index so that $s_j$ falls into a previous ball. Hence


Í l , W(s) ds 
Ui 


<| a. gun. | g "alii 
s, eR 8$2€D,,1 $j-1€Bj—2,j—2 


oa jd 
x | "T e "dil ds,... dsi 
$;€Bj i, j- 10 Jj. Bii 


< J e 7a e Talal E g "aja 
s €IRd $2€ Bj | 8j .1€Bj—2,j—2 


ji-1 
$335 VGj ij N Bii) 


ad 
e "4l dsj ds, 
V(Bj-1,j-1) sj€Bj—1,j-1 


where in the last inequality we have used the fact that if f(|x|) is decreasing in
|x|, then for y with |y| > 1 it is the case that


V(B(0, DN BG. ly) 
V(B(0, 1) x€B(0,1) 
The volume in the numerator can be estimated by the volume of the largest possible 


intersection, which is the one with the first ball in the situation that s;..; lies on its 
boundary: 


f(lx)dx < f (xp dx. 


| eR 


F— | 
$17, V(Bj-1,j-1 N Biz) Lax jl, lxil, lxi) 


mt en) 














V(Bj-1,j—1) V(B(0, |xj iD) 
|—] 
=F 1,(1 EORUM J 
Td Xj—1 Xj—1 


We now continue our estimate as follows, using the change of variables y; — 
d 
74|xi ^: 


Í ; , W(s)ds 
VM 














N d z a d T 
< J e "aal day |x, 47! J e TAA drglxatTt... 
|x1|»0 Ix2| xxi | 
ode. st ag -—-1) X] x 
x | e "abrjal diq\xj—1|4 ig - D L1. erus ) 
Ix j-11xlxj-2l Jd Xj—1l |Xj-1 


= jd = 
x J e 7493" drglxj| t dix;|--- dixal dixil 
Ix jx jail 




(j —1) y \ p y, Md 
= n2) 65) ) 
Oxyjzxyi Md Jj-1 Yj-1 


x soo "dy; M -dyj 
L] l 
<| (J "Lr. sh "E ři dy; . -- dyi, 
Oxyjz-Xyp Md yj-1 Yj-1 


which tends to zero according to Proposition 3.1 and dominated convergence.
It follows from all this that

\[ \liminf_{d \to \infty} g_d(k) \ge \frac{k}{(k+1)!}. \]

The final observation is the following: since the sum over k of $k/(k+1)!$ is equal
to 1, the inequality is in fact an equality, and we conclude that

\[ \lim_{d \to \infty} g_d(k) = \frac{k}{(k+1)!}, \]

proving the theorem. □


Acknowledgments. We thank Charles Newman for useful conversations. We 
also thank an anonymous referee for his or her very careful reading of the manu- 
script, which greatly improved this paper. 


REFERENCES 


[1] HÄGGSTRÖM, O. and MEESTER, R. (1996). Nearest neighbour and hard sphere models in
continuum percolation. Random Structures Algorithms 9 295-315. MR1606845
[2] NANDA, S. and NEWMAN, C. M. (1999). Random nearest neighbor and influence graphs on $\mathbb{Z}^d$.
Random Structures Algorithms 15 262-278. MR1716765
[3] ZONG, C. (1998). The kissing numbers of convex bodies: A brief survey. Bull. London Math.
Soc. 30 1-10. MR1479030


I. KOZAKOVA
R. MEESTER
DEPARTMENT OF MATHEMATICS
VRIJE UNIVERSITEIT AMSTERDAM
DE BOELELAAN 1081
1081 HV AMSTERDAM
THE NETHERLANDS
E-MAIL: rmeester@cs.vu.nl

S. NANDA
TIFR CENTRE
P.O. BOX 1234
IISc CAMPUS
BANGALORE 560012
INDIA
E-MAIL: nanda@math.tifrbng.res.in


The Annals of Probability
2006, Vol. 34, No. 2, 539-576
DOI: 10.1214/009117905000000602
© Institute of Mathematical Statistics, 2006


DYNAMICAL STABILITY OF PERCOLATION FOR SOME
INTERACTING PARTICLE SYSTEMS AND ε-MOVABILITY


BY ERIK I. BROMAN¹ AND JEFFREY E. STEIF²
Chalmers University of Technology


In this paper we will investigate dynamic stability of percolation for the
stochastic Ising model and the contact process. We also introduce the
notion of downward and upward ε-movability which will be a key tool for our
analysis.


1. Introduction. Consider bond percolation on an infinite connected locally
finite graph G, where, for some $p \in [0, 1]$, each edge (bond) of G is, independently
of all others, open with probability p and closed with probability $1 - p$. Write $\pi_p$
for this product measure. The main questions in percolation theory (see [10]) deal
with the possible existence of infinite connected components (clusters) in the
random subgraph of G consisting of all sites and all open edges. Write $\mathcal{C}$ for the event
that there exists such an infinite cluster. By Kolmogorov's 0-1 law, the probability
of $\mathcal{C}$ is, for fixed G and p, either 0 or 1. Since $\pi_p(\mathcal{C})$ is nondecreasing in p, there
exists a critical probability $p_c = p_c(G) \in [0, 1]$ such that

\[ \pi_p(\mathcal{C}) = \begin{cases} 0, & \text{for } p < p_c, \\ 1, & \text{for } p > p_c. \end{cases} \]

At $p = p_c$, we can have either $\pi_{p_c}(\mathcal{C}) = 0$ or $\pi_{p_c}(\mathcal{C}) = 1$, depending on G.

In [15] the authors initiated the study of dynamical percolation. In this model,
with p fixed, the edges of G switch back and forth according to independent
2-state Markov chains where 0 switches to 1 at rate p and 1 switches to 0 at rate
$1 - p$. In this way, if we start with distribution $\pi_p$, the distribution of the system is
at all times $\pi_p$. The general question studied in [15] was whether there could exist
atypical times at which the percolation structure looks different than at a fixed
time.
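The stationarity claim for a single edge can be made concrete. For a 2-state chain that switches from 0 to 1 at rate p and from 1 to 0 at rate $1 - p$, the probability of being open satisfies $P'(t) = p - P(t)$, so $P(t) = p + (P(0) - p)e^{-t}$. The sketch below encodes this closed form and checks that starting from the Bernoulli(p) law leaves the law unchanged; the function names are hypothetical and only one edge is modeled.

```python
import math


def prob_open(t, p, start_open):
    """P(edge is open at time t) for the 2-state chain with rates p and 1 - p."""
    x0 = 1.0 if start_open else 0.0
    # Solution of P'(t) = p * (1 - P) - (1 - p) * P = p - P with P(0) = x0.
    return p + (x0 - p) * math.exp(-t)


def prob_open_from_stationary(t, p):
    """Same probability when the initial state is open with probability p."""
    return p * prob_open(t, p, True) + (1 - p) * prob_open(t, p, False)


if __name__ == "__main__":
    p = 0.3
    for t in (0.0, 0.5, 2.0, 10.0):
        print(t, prob_open(t, p, True), prob_open(t, p, False),
              prob_open_from_stationary(t, p))
```

Because the two rates sum to 1, each edge forgets its initial state at rate 1 regardless of p.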

We record here some of the results from [15]: (i) for any graph G and for any
$p < p_c(G)$, there are no times at which percolation occurs, (ii) for any graph G
and for any $p > p_c(G)$, there are no times at which percolation does not occur,
(iii) there exist graphs which do not percolate for $p = p_c(G)$, but, nonetheless, for
$p = p_c(G)$, there are exceptional times at which percolation occurs, (iv) there exist
graphs which percolate for $p = p_c(G)$, but, nonetheless, for $p = p_c(G)$, there are
exceptional times at which percolation does not occur, and (v) for $\mathbb{Z}^d$ with $d \ge 19$
with $p = p_c(\mathbb{Z}^d)$, there are no times at which percolation occurs. In addition, it
has recently been shown in [23] that, for site percolation on the triangular lattice,
for $p = p_c = 1/2$, there are exceptional times at which percolation occurs. Given
this, a similar result would be expected for $\mathbb{Z}^2$.

Received June 2004; revised January 2005.
¹ Supported in part by the Swedish Natural Science Research Council.
² Supported in part by the Swedish Natural Science Research Council, NSF Grant DMS-01-0384
and the Göran Gustafsson Foundation (KVA).
AMS 2000 subject classifications. 82C43, 82B43, 60K35.
Key words and phrases. Percolation, stochastic Ising models, contact process.

The point of the present paper is to initiate a study of dynamical percolation 
for interacting systems where the edges or sites flip at rates which depend on the 
neighbors. We point out that in a different direction such questions in continuous 
space, but without interactions, related to continuum percolation have been studied 
in [2]. 


Ising model results. Precise definitions of the following Ising model measures
and the stochastic Ising model will be given in Section 2. Fix an infinite graph
$G = (S, E)$. Let $\mu^{+,\beta,h}$ be the plus state for the Ising model with inverse temperature $\beta$
and external field $h$ on G [this is a probability measure on $\{-1, 1\}^S$]. Let $\Psi^{+,\beta,h}$
denote the corresponding stochastic Ising model [this is a stationary continuous
time Markov chain on $\{-1, 1\}^S$ with marginal distribution $\mu^{+,\beta,h}$]. Let $\mathcal{C}^+$ ($\mathcal{C}^-$)
denote the event that there exists an infinite cluster of sites with spin 1 ($-1$) and
let $\mathcal{C}_t^+$ ($\mathcal{C}_t^-$) denote the event that there exists an infinite cluster of sites with spin
1 ($-1$) at time t. It is known that the family $\mu^{+,\beta,h}$ is, for fixed $\beta$, stochastically
increasing (to be defined later) in h.


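THEOREM 1.1. Consider a graph $G = (S, E)$ of bounded degree. Fix $\beta > 0$
and let $h_c = h_c(\beta)$ be defined by

\[ h_c := \inf\{h : \mu^{+,\beta,h}(\mathcal{C}^+) = 1\}. \]

Then for all $h > h_c$,

\[ \Psi^{+,\beta,h}(\mathcal{C}_t^+ \text{ occurs for every } t) = 1 \]

and for all $h < h_c$,

\[ \Psi^{+,\beta,h}(\exists t \ge 0 : \mathcal{C}_t^+ \text{ occurs}) = 0. \]

If we modify $h_c$ to be instead

\[ h_c' := \sup\{h : \mu^{+,\beta,h}(\mathcal{C}^-) = 1\}, \]

the same two claims hold with $\mathcal{C}_t^+$ replaced by $\mathcal{C}_t^-$ and with $h < h_c'$ and $h > h_c'$
reversed.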


This result tells us what happens in the subcritical and supercritical cases (with
respect to h with $\beta$ held fixed). It is the analogue of the easier Proposition 1.1
in [15] where it is proved that if $p < p_c$ ($p > p_c$), then, with probability 1, there is
percolation at no time (at all times).

The following easy lemma gives us information about when $h_c$ is nontrivial.




LEMMA 1.2. Assume the graph G has bounded degree and let $\beta$ be arbitrary.
Then $h_c > -\infty$. If $p_c(\text{site}) < 1$, then $h_c < \infty$. Similar results hold if $h_c$ is replaced
by $h_c'$.


The following theorems, where we restrict to $\mathbb{Z}^d$, will only discuss the case
$h = 0$. However, this will in many cases give us information about the "critical"
case $(\beta, h_c(\beta))$ since, in a number of situations, $h_c(\beta) = 0$. For example,
this is true on all $\mathbb{Z}^d$ with $d \ge 2$ and $\beta$ sufficiently large. We also
mention that while the relationship between $h_c$ and $h_c'$ in Theorem 1.1 might in
general be complicated, for $\mathbb{Z}^d$, one easily has that $h_c = -h_c'$; this follows
from the known fact that the plus and minus states are the same when $h \ne 0$. When
$h = 0$, we will abbreviate $\mu^{+,\beta,0}$ by $\mu^{+,\beta}$ and $\Psi^{+,\beta,0}$ by
$\Psi^{+,\beta}$. We point out that while $\mu^{+,\beta,h}$ is stochastically increasing in
$h$ for fixed $\beta$, there is no such monotonicity in $\beta$ for fixed $h$, not even
for $h = 0$. Therefore, we must use a different approach in the latter case.

We first study percolation of $-1$'s and then percolation of 1's. Let

$$\beta_p(2) := \inf\Big\{\beta : \sum_{\ell=1}^{\infty} \ell\, 3^{\ell-1} e^{-2\beta\ell} < \infty\Big\} = \frac{\log 3}{2}.$$

We will refer to $\beta_p(2)$ as the critical inverse temperature of the Peierls regime
for $\mathbb{Z}^2$. The choice of $\beta_p(2)$ might at first look quite arbitrary, but it is
exactly what is needed to carry out a contour argument (known as the Peierls argument)
for $\mathbb{Z}^2$. For $d \ge 3$, there is a $\beta_p(d)$ such that, for $\beta$ larger than
$\beta_p(d)$, a similar (although topologically more complicated) argument works for
$\mathbb{Z}^d$. As a result of this "contour argument," it is well known and easy to show
that, for $\beta > \beta_p(d)$, we have that

$$\mu^{+,\beta}(\mathcal{C}^-) = 0. \tag{1}$$
Our next result is a dynamical version of (1) and we emphasize that this corresponds
to the critical case as it is easy to check that, for these $\beta$'s, $h_c(\beta) = 0$.

THEOREM 1.3. For $\mathbb{Z}^d$ with $d \ge 2$ and $\beta > \beta_p(d)$,

$$\Psi^{+,\beta}(\exists t \ge 0 : \mathcal{C}_t^- \text{ occurs}) = 0.$$

It is well known that $\beta_p(d) > \beta_c(d)$, the latter being the critical inverse
temperature for the Ising model on $\mathbb{Z}^d$. For $d = 2$, Theorem 1.3 can be extended
down to the critical inverse temperature $\beta_c(2)$. First, it is known (see [5]) that on
$\mathbb{Z}^2$, for all $\beta$,

$$\mu^{+,\beta}(\mathcal{C}^-) = 0. \tag{2}$$

Our dynamical analogue for $\beta > \beta_c$ is the following, where we again point out
that this is also a critical case as it is easy to check that, for these $\beta$'s, we also
have $h_c(\beta) = 0$.




THEOREM 1.4. For the stochastic Ising model $\Psi^{+,\beta}$ on $\mathbb{Z}^2$ with
parameter $\beta > \beta_c$,

$$\Psi^{+,\beta}(\exists t \ge 0 : \mathcal{C}_t^- \text{ occurs}) = 0.$$


Interestingly, (1) is not always true for $\beta > \beta_c(d)$, although, as stated, it is
true for $\mathbb{Z}^2$ or $\beta$ sufficiently large. In [1] it is shown that for
$\mathbb{Z}^d$ with large $d$, there exists $\beta^+ > \beta_c(d)$ such that the probability
in (1) is, in fact, 1 for all $\beta < \beta^+$. Moreover, they show that, for these $\beta$,
there exists $h > 0$ with

$$\mu^{+,\beta,h}(\mathcal{C}^-) = 1.$$

For such $\beta$'s, this means that $h_c' > 0$ and, hence, it immediately follows from
Theorem 1.1 that

$$\Psi^{+,\beta}(\mathcal{C}_t^- \text{ occurs for every } t) = 1.$$

Note that, for these values of $\beta$, the case $h = 0$ is a noncritical case.

We next look at percolation of 1's under $\mu^{+,\beta}$. In the above results, we have
not discussed the case of percolation of $-1$'s when $\beta < \beta_c$. However, by
symmetry, this is the same as studying percolation of 1's in this case and so we can now
move over to the study of $\mathcal{C}^+$.

First, it is well known that, for any graph of bounded degree,
$\mu^{+,\beta,h} \ne \mu^{-,\beta,h}$ implies that $\mu^{+,\beta,h}(\mathcal{C}^+) = 1$.
(This is proved in [3] for $\mathbb{Z}^d$; this argument extends to any graph of bounded
degree.) In particular, for any graph $G$ of bounded degree and for $\beta > \beta_c(G)$,

$$\mu^{+,\beta}(\mathcal{C}^+) = 1. \tag{3}$$

Our next result is a dynamical version of (3) for $\mathbb{Z}^d$. We mention that this
result sometimes corresponds to a critical case and sometimes not. For
$\beta > \beta_p(d)$ in $\mathbb{Z}^d$ or $\beta > \beta_c(2)$ in $\mathbb{Z}^2$, we have seen
that $h_c = 0$ and so, in these cases, this next result covers the critical case. However,
as pointed out, for $d$ large and $\beta$ just a little higher than $\beta_c$, the result in
[1] gives us that $h_c < 0$ and, hence, in this case, this next theorem already follows
from Theorem 1.1.


THEOREM 1.5. For the stochastic Ising model $\Psi^{+,\beta}$ on $\mathbb{Z}^d$ with
parameter $\beta > \beta_c(d)$,

$$\Psi^{+,\beta}(\mathcal{C}_t^+ \text{ occurs for every } t) = 1.$$


(The proof we give actually works for any graph of bounded degree.) We mention that
while $\beta > \beta_c$ is a sufficient condition for (3) to hold, it is certainly not
necessary. For example, on $\mathbb{Z}^3$ we have that $\mu^{+,0}(\mathcal{C}^+) = 1$ since
$\mu^{+,0} = \pi_{1/2}$ and the critical value for site percolation on $\mathbb{Z}^3$ is less
than 1/2. The reason $\beta_c$ appears is the connection between the Ising model and the
random cluster model; $\beta_c$ corresponds to the critical value for percolation in the
corresponding random cluster model (see [13]).

We are now left with the case $\beta \le \beta_c$. We will not be able to say too much
since it is not known in all cases whether one has percolation at a fixed time. We first,
however, have the following easy result for $d \ge 3$. We do not prove this result since
it follows easily from the fact that the critical value for site percolation on
$\mathbb{Z}^d$ is less than 1/2 for $d \ge 3$, as this gives easily that $h_c(\beta) < 0$ for
$\beta$ sufficiently small and, hence, Theorem 1.1 is applicable.

Note that the case $\beta = 0$ follows from the result in [15] mentioned above.

PROPOSITION 1.6. For $d \ge 3$, there exists $\beta_1(d) > 0$ such that, for all
$\beta < \beta_1(d)$, we have that

$$\Psi^{+,\beta}(\mathcal{C}_t^+ \text{ occurs for every } t) = 1.$$

Finally, due to work of Higuchi, we can determine what happens with $\beta < \beta_c$
for $\mathbb{Z}^2$. It is shown in [16] that, for $\mathbb{Z}^2$, for all $\beta < \beta_c$, we
have that $h_c(\beta) > 0$. The following result follows from this fact and Theorem 1.1.

THEOREM 1.7. For $d = 2$, for all $\beta < \beta_c$, we have that

$$\Psi^{+,\beta}(\exists t \ge 0 : \mathcal{C}_t^+ \text{ occurs}) = 0.$$

We note that even though it is known that for $\mathbb{Z}^2$,
$\mu^{+,\beta_c}(\mathcal{C}^+) = 0$, we cannot conclude that

$$\Psi^{+,\beta_c}(\exists t \ge 0 : \mathcal{C}_t^+ \text{ occurs}) = 0,$$

since it is known (see [17]) that $h_c(\beta_c) = 0$. In contrast, based on the results in
[23], it is interesting to ask the following:

QUESTION 1.8. For the graph $\mathbb{Z}^2$, is it the case that

$$\Psi^{+,\beta_c}(\exists t \ge 0 : \mathcal{C}_t^+ \text{ occurs}) = 1?$$

We finally mention that, interestingly, it is also known (see again [17]) that, for
$\beta < \beta_c$, $\mu^{+,\beta,h_c(\beta)}(\mathcal{C}^+) = 0$.


Contact process results. Precise definitions of the following items will be
given in Section 2. Fix an infinite graph $G = (S, E)$. Consider the contact process
on $G = (S, E)$ with parameter $\lambda$. Denote by $\mu_\lambda$ the stochastically largest
invariant measure, the so-called "upper invariant measure" (this is a probability measure
on $\{0, 1\}^S$). Let $\Psi^\lambda$ denote the corresponding stationary contact process
(this is a stationary continuous time Markov chain on $\{0, 1\}^S$ with marginal
distribution $\mu_\lambda$).




If $0 < \lambda_1 < \lambda_2$, it is well known that $\mu_{\lambda_1}$ is stochastically
smaller than $\mu_{\lambda_2}$, denoted by

$$\mu_{\lambda_1} \preceq \mu_{\lambda_2}$$

(see Section 2 for this precise definition).


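THEOREM 1.9. Consider the contact process $\Psi^\lambda$ on a graph $G = (S, E)$,
with initial and stationary distribution $\mu_\lambda$. Let $\lambda_p$ be defined by

$$\lambda_p := \inf\{\lambda : \mu_\lambda(\mathcal{C}^+) = 1\}.$$

We have that, for all $\lambda > \lambda_p$,

$$\Psi^\lambda(\mathcal{C}_t^+ \text{ occurs for every } t) = 1.$$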


In order for this theorem to be nonvacuous, we need to know that $\lambda_p < \infty$
for at least some graph. First, the fact that there exists $\lambda$ such that
$\mu_\lambda(\mathcal{C}^+) > 0$ for $\mathbb{T}^d$ with $d \ge 2$ follows from [12]. Here
$\mathbb{T}^d$ is the unique infinite connected graph without circuits and in which each
site has exactly $d + 1$ neighbors; $\mathbb{T}^d$ is commonly known as the homogeneous
tree of order $d$. Combined with a 0-1 law which we develop, Proposition 4.2, we obtain
that $\lambda_p < \infty$ in this case. For $\mathbb{Z}^d$ with $d \ge 2$ (as well as for
$\mathbb{T}^d$), it is proved in [22] that, for large $\lambda$, $\mu_\lambda$ stochastically
dominates high density product measures, which immediately implies that
$\lambda_p < \infty$ in these cases.

When we prove Theorem 1.1, we will, in fact, prove a more general theorem
which holds for a large class of systems. However, this proof will only work for
models satisfying the so-called FKG lattice condition (which we call "monotone"
in this paper). We now point out the important fact that, for $\lambda < 2$, in 1 dimension,
the upper invariant measure for the contact process, while having positive correlations,
is not monotone (see [20]). These terms are defined in Section 2. (One would
also believe it is never monotone whenever the measure is not $\delta_0$.) Hence,
Theorem 1.9 does not follow from the generalization of Theorem 1.1 which will come
later.


$\varepsilon$-movability. We now introduce the concepts of upward and downward
$\varepsilon$-movability. While we mainly introduce these as a technical tool to be used in
our main results, it turns out that they are of interest in their own right. In [4] the
concept of upward movability is studied for its own sake and is related to other
well studied concepts, such as uniform insertion tolerance.

Let $S$ be a countable set. Take any probability measure $\mu$ on $\{-1, 1\}^S$ and
let $X$ be a $\{-1, 1\}^S$ valued random variable with distribution $\mu$. Let $Z$ be a
$\{-1, 1\}^S$ valued random variable with distribution $\pi_{1-\varepsilon}$, and be
independent of $X$. Define $X^{(-,\varepsilon)}$ by letting
$X^{(-,\varepsilon)}(s) = \min(X(s), Z(s))$ for every $s \in S$, and let
$\mu^{(-,\varepsilon)}$ denote the distribution of $X^{(-,\varepsilon)}$. In a similar way,
define $X^{(+,\varepsilon)}$ by letting $X^{(+,\varepsilon)}(s) = \max(X(s), Z(s))$ for every
$s \in S$, where $Z$ has distribution $\pi_\varepsilon$ and is independent of $X$. Denote the
distribution of $X^{(+,\varepsilon)}$ by $\mu^{(+,\varepsilon)}$.
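
The two perturbations can be illustrated on a finite set of sites. The sketch below is only an illustration, and it assumes that $\pi_p$ denotes the product measure under which each site independently takes the value 1 with probability $p$ and $-1$ otherwise; the function names `thin_down` and `push_up` are hypothetical and not from the paper.

```python
import random

def thin_down(x, eps, rng):
    """Sample X^(-,eps) from a configuration x mapping site -> spin in {-1, +1}."""
    # Assumption: Z(s) = +1 with probability 1 - eps, independently for each site.
    # Taking the minimum lowers a +1 spin to -1 exactly where Z(s) = -1.
    return {s: min(v, 1 if rng.random() < 1 - eps else -1) for s, v in x.items()}

def push_up(x, eps, rng):
    """Sample X^(+,eps): here Z(s) = +1 with probability eps and we take the maximum."""
    return {s: max(v, 1 if rng.random() < eps else -1) for s, v in x.items()}

rng = random.Random(0)
x = {s: rng.choice((-1, 1)) for s in range(10)}   # an arbitrary starting configuration
lowered = thin_down(x, 0.2, rng)
raised = push_up(x, 0.2, rng)
```

By construction `lowered` lies below `x` and `raised` lies above `x` at every site, which is the pointwise ordering behind the words "downward" and "upward".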




DEFINITION 1.10. Let $(\mu_1, \mu_2)$ be a pair of probability measures on
$\{-1, 1\}^S$, where $S$ is a countable set. Assume that

$$\mu_1 \preceq \mu_2.$$

If

$$\mu_1 \preceq \mu_2^{(-,\varepsilon)},$$

then we say that this pair of probability measures is downward $\varepsilon$-movable. If
the pair is downward $\varepsilon$-movable for some $\varepsilon > 0$, we say that the pair
is downward movable. Analogously, if

$$\mu_1^{(+,\varepsilon)} \preceq \mu_2,$$

then we say that the pair $(\mu_1, \mu_2)$ is upward $\varepsilon$-movable and that it is
upward movable if the pair is upward $\varepsilon$-movable for some $\varepsilon > 0$.


For probability measures on $\{0, 1\}^S$, we have identical definitions.

The relevance of downward (or upward) $\varepsilon$-movability to our dynamical
percolation analysis will be explained in Section 5. In Section 3 we will prove
$\varepsilon$-movability for general monotone systems, which will eventually lead to a proof
of Theorem 1.1 (and its generalization). We now state a similar and key result for
the contact process.


THEOREM 1.11. Let $G$ be a graph of bounded degree, $0 < \lambda_1 < \lambda_2$ and
$\mu_{\lambda_1}$, $\mu_{\lambda_2}$ be the upper invariant measures for the contact process on
$\{0, 1\}^S$ with parameters $\lambda_1$ and $\lambda_2$, respectively. Then
$(\mu_{\lambda_1}, \mu_{\lambda_2})$ is downward movable.


We finally mention how the above questions that we study fall into the context
of classical Markov process theory. Let $(\Omega, \mathcal{F}, P)$ be the probability space
where a stationary Markov process $\{X_t\}_{t \ge 0}$ taking values in some state space
$\mathcal{S}$ is defined. Letting $\mu$ denote the distribution of $X_t$ (for any $t$),
consider an event $A \subseteq \mathcal{S}$ with $\mu(A) = 1$. Let $A_t$ be the event that
$A$ occurs at time $t$. We say that $A$ is a dynamically stable event if
$P(A_t \ \forall t \ge 0) = 1$. In Markov process terminology, this is equivalent to saying
that $A^c$ has capacity zero. All the questions in this paper deal with showing, for
various models and parameters, that the event that there exists/there does not exist an
infinite connected component of sites which are all open is dynamically stable.

The rest of this paper is divided into 9 sections. In Section 2 we will give all
necessary preliminaries and precise definitions of our models. Sections 3 and 4
will deal with the concept of $\varepsilon$-movability. In Section 3 we develop what will be
needed to prove Theorem 1.1 and its generalization. In Section 4 we will prove
Theorem 1.11 (which is the key to Theorem 1.9), as well as give a proof that
$\lambda_p < \infty$ for trees. In Section 5 we prove two elementary lemmas which relate the
notion of $\varepsilon$-movability to dynamical questions. In the remaining sections proofs of
the remaining results are given. We note that the proof of Theorem 1.4 will use the
proof of Theorem 1.5 and, hence, will come afterward.

We end with one bit of notation. If $\mu$ is a probability measure on some set $U$,
we write $X \sim \mu$ to mean that $X$ is a random variable taking values in $U$ with
distribution $\mu$.


2. Models and definitions. Before presenting the interacting particle systems
discussed in this paper, we will present some definitions and results related to
stochastic domination. Let $S$ be any countable set. For $\sigma, \sigma' \in \{-1, 1\}^S$,
we write $\sigma \preceq \sigma'$ if $\sigma(s) \le \sigma'(s)$ for every $s \in S$. An
increasing function $f$ is a function $f : \{-1, 1\}^S \to \mathbb{R}$ such that
$f(\sigma) \le f(\sigma')$ for all $\sigma \preceq \sigma'$. For two probability measures
$\mu, \mu'$ on $\{-1, 1\}^S$, we write $\mu \preceq \mu'$ if, for every continuous increasing
function $f$, we have that $\mu(f) \le \mu'(f)$. [$\mu(f)$ is shorthand for
$\int f(x)\, d\mu(x)$.] When $\{-1, 1\}^S$ is replaced by $\{0, 1\}^S$, we have identical
definitions. Strassen's theorem (see [19], page 72) states that if $\mu \preceq \mu'$, then
there exist random variables $X, X'$ with distribution $\mu, \mu'$, respectively, such that
$X \preceq X'$ a.s.
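
For a very small finite set $S$, stochastic domination can be checked by brute force. The sketch below is an illustration only: it uses the standard fact that, on a finite set, domination is equivalent to $\mu(A) \le \mu'(A)$ for every increasing event $A$, and it enumerates those events directly. The helper names and the product measures used as test input are hypothetical choices, not objects from the paper.

```python
from itertools import product

def configurations(n):
    """All spin configurations on n sites, as tuples with entries in {-1, +1}."""
    return list(product((-1, 1), repeat=n))

def leq(a, b):
    """Coordinatewise partial order on configurations."""
    return all(x <= y for x, y in zip(a, b))

def increasing_events(n):
    """All subsets of configurations that are closed under moving upward."""
    cs = configurations(n)
    events = []
    for mask in range(1 << len(cs)):
        event = {c for i, c in enumerate(cs) if (mask >> i) & 1}
        if all(b in event for a in event for b in cs if leq(a, b)):
            events.append(event)
    return events

def dominated(mu, nu, n, tol=1e-12):
    """True if mu is stochastically smaller than nu (finite case, via increasing events)."""
    return all(sum(mu[c] for c in ev) <= sum(nu[c] for c in ev) + tol
               for ev in increasing_events(n))

def product_measure(n, p):
    """Each site is +1 with probability p, independently (illustrative input)."""
    return {c: p ** sum(v == 1 for v in c) * (1 - p) ** sum(v == -1 for v in c)
            for c in configurations(n)}
```

The enumeration is exponential in the number of configurations, so this is only usable for two or three sites; it is meant to make the definition concrete, not to be an efficient test.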

A very useful result is the so-called Holley's inequality, which appeared first 
in [18]. We will present a variant of the theorem by Holley; it is not the most 
general, but is sufficient for our purposes. 


THEOREM 2.1. Take $S$ to be a finite set. Let $\mu, \mu'$ be probability measures
on $\{-1, 1\}^S$ which assign positive probability to all configurations
$\sigma \in \{-1, 1\}^S$. Assume that

$$\mu(\sigma(s) = 1 \mid \sigma(S \setminus s) = \xi) \le \mu'(\sigma(s) = 1 \mid \sigma(S \setminus s) = \xi')$$

for every $s \in S$ and $\xi \preceq \xi'$, where $\xi, \xi' \in \{-1, 1\}^{S \setminus s}$.
Then $\mu \preceq \mu'$.


PROOF. See [9] or [13] for a proof. $\Box$


Two properties of probability measures which are often encountered within the 
field of interacting particle systems are the monotonicity property and the property 
of positive correlations presented below. 


DEFINITION 2.2. Take $S$ to be a finite set. A probability measure $\mu$ on
$\{-1, 1\}^S$ which assigns positive probability to every $\sigma \in \{-1, 1\}^S$ is called
monotone if, for every $s \in S$ and $\xi \preceq \xi'$ where
$\xi, \xi' \in \{-1, 1\}^{S \setminus s}$,

$$\mu(\sigma(s) = 1 \mid \sigma(S \setminus s) = \xi) \le \mu(\sigma(s) = 1 \mid \sigma(S \setminus s) = \xi').$$


We point out immediately that it is known that this is equivalent to the so-called 
FKG lattice condition. 




DEFINITION 2.3. A probability measure $\mu$ on $\{-1, 1\}^S$ is said to have positive
correlations if, for all bounded increasing functions
$f, g : \{-1, 1\}^S \to \mathbb{R}$, we have

$$\mu(fg) \ge \mu(f)\mu(g).$$


The following important result is sometimes known as the FKG inequality
(see [7]).


THEOREM 2.4. Take $S$ to be a finite set. Let $\mu$ be a monotone probability
measure on $\{-1, 1\}^S$ which assigns positive probability to every configuration.
Then $\mu$ has positive correlations.


PROOF. This was originally proved in [7]; see also [9] for a proof. $\Box$


In this section, and also later in this paper, we will talk about convergence of
probability measures. Convergence will always mean weak convergence, where
$\{0, 1\}^S$ is given the product topology.


2.1. The Ising model. Take $G = (S, E)$, where $|S| < \infty$. The Ising measure
$\mu^{\beta,h}$ on $\{-1, 1\}^S$ at inverse temperature $\beta \ge 0$, external field $h$ and
with free boundary conditions is defined as follows. For any configuration
$\sigma \in \{-1, 1\}^S$, let

$$H^{\beta,h}(\sigma) = -\beta \sum_{\{t,t'\} \in E} \sigma(t)\sigma(t') - h \sum_{t \in S} \sigma(t). \tag{4}$$

$H^{\beta,h}$ is called the Hamiltonian. Define $\mu^{\beta,h}$ by assigning the probability

$$\mu^{\beta,h}(\sigma) = \frac{e^{-H^{\beta,h}(\sigma)}}{Z} \tag{5}$$

to any configuration $\sigma \in \{-1, 1\}^S$, where $Z$ is a normalization constant. Of
course, $Z$ depends on the graph and the values $\beta$ and $h$, but this will not be
important for us and, therefore, not reflected in the notation.
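
On a small finite graph the free-boundary Ising measure can be computed by direct enumeration. The following is a minimal sketch of definitions (4) and (5), under the assumption that sites are labelled `0, ..., n-1` and edges are given as pairs of labels; the function name `ising_measure` is illustrative.

```python
import math
from itertools import product

def ising_measure(n_sites, edges, beta, h):
    """Free-boundary Ising measure on a finite graph, by enumeration of all spins."""
    weights = {}
    for sigma in product((-1, 1), repeat=n_sites):
        # Hamiltonian (4): pair interaction over edges plus external field term.
        hamiltonian = (-beta * sum(sigma[t] * sigma[u] for t, u in edges)
                       - h * sum(sigma))
        weights[sigma] = math.exp(-hamiltonian)
    z = sum(weights.values())            # normalization constant Z of (5)
    return {sigma: w / z for sigma, w in weights.items()}

# Two sites joined by a single edge, no external field.
mu = ising_measure(2, [(0, 1)], beta=0.5, h=0.0)
```

With $h = 0$ the two aligned configurations receive equal mass, larger than that of the two misaligned ones, and at $\beta = 0$ the measure is uniform.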

Take $S_n := \Lambda_{n+1} = \{-n-1, \ldots, n+1\}^d$ and $E_n$ to be the set of all
nearest neighbor pairs of $S_n$. Given a configuration $\xi$ on
$\{-1, 1\}^{\mathbb{Z}^d \setminus \Lambda_n}$, let, for $\sigma \in \{-1, 1\}^{\Lambda_n}$,

$$H_n^{\xi,\beta,h}(\sigma) = -\beta \sum_{\substack{\{t,t'\} \in E_n \\ t, t' \in \Lambda_n}} \sigma(t)\sigma(t') - h \sum_{t \in \Lambda_n} \sigma(t) - \beta \sum_{\substack{\{t,t'\} \in E_n \\ t \in \Lambda_n,\ t' \in \Lambda_{n+1} \setminus \Lambda_n}} \sigma(t)\xi(t') \tag{6}$$

be our Hamiltonian. Here $\xi$ is called a boundary condition. Again, we define a
probability measure using (5), but using the Hamiltonian of (6) instead. This Ising
measure will be denoted by $\mu_n^{\xi,\beta,h}$. The cases $\xi \equiv 1$ and
$\xi \equiv -1$ are especially important and the corresponding Ising measures are denoted
by $\mu_n^{+,\beta,h}$ and $\mu_n^{-,\beta,h}$, respectively. We view $\mu_n^{+,\beta,h}$
(resp. $\mu_n^{-,\beta,h}$) as a probability measure on $\{-1, 1\}^{\mathbb{Z}^d}$ by letting,
with probability 1, the configuration be identically 1 (resp. $-1$) outside $\Lambda_n$. It
is known (see [19], page 189) that the sequences $\{\mu_n^{+,\beta,h}\}$ and
$\{\mu_n^{-,\beta,h}\}$ converge as $n$ tends to infinity; these limits are denoted by
$\mu^{+,\beta,h}$ and $\mu^{-,\beta,h}$.

The same kind of construction can be carried out on any infinite connected
locally finite graph $G = (S, E)$. One defines a Hamiltonian analogous to the one
in (6), but with $\Lambda_n$ replaced by any $\Lambda \subseteq S$ where $|\Lambda| < \infty$.
With $\xi \equiv 1$ or $\xi \equiv -1$, one then considers the corresponding limits of Ising
measures as $\Lambda \uparrow S$, the limit turning out to be independent of the particular
choice of sequence. See, for instance, [9] for how this is carried out in detail. Fix
$h = 0$ and abbreviate $\mu^{+,\beta,0}$ and $\mu^{-,\beta,0}$ by $\mu^{+,\beta}$ and
$\mu^{-,\beta}$. It is well known [8, 9] that, for any graph, there exists
$\beta_c \in [0, \infty]$ such that, for $0 \le \beta < \beta_c$, we have that
$\mu^{+,\beta} = \mu^{-,\beta}$ (and there is then a unique so-called Gibbs state) and for
$\beta > \beta_c$, $\mu^{+,\beta} \ne \mu^{-,\beta}$. For $\mathbb{Z}^d$ with $d \ge 2$, and many
other graphs, $\beta_c \in (0, \infty)$. $\beta_c$ is sometimes referred to as the critical
inverse temperature for phase transition in the Ising model. Furthermore, in [14] the
author shows that if $G$ is of bounded degree, the condition $\beta_c < \infty$ is equivalent
to the condition $p_c < 1$, where $p_c$ is the critical parameter value for site
percolation on $G$. It is easy to see that for any graph of bounded degree $p_c > 0$ (see
the proof of Theorem 1.10 of [10]). This, in turn, implies, via the connection between
the random cluster model and the Ising model described below, that $\beta_c > 0$ for any
graph of bounded degree.


2.2. Spin systems. A configuration $\sigma \in \{-1, 1\}^S$ can be seen as particles on
a discrete set $S$ having one of two different "spins" represented by $-1$ and 1. To
this we will add a stochastic dynamics, and assume that the system is described
by "flip rate intensities," which we will denote by
$\{C(s, \sigma)\}_{s \in S,\, \sigma \in \{-1,1\}^S}$. $C(s, \sigma)$ represents the rate at
which site $s$ changes its state when the present configuration is $\sigma$. Of course,
$C(s, \sigma) \ge 0$ $\forall s \in S$, $\sigma \in \{-1, 1\}^S$, and we assume that the
interaction is nearest neighbor in the sense that the flip rate of a site $s \in S$ only
depends on the configuration $\sigma$ at $s$ and at sites $t$ with $\{s, t\} \in E$. We will
limit ourselves to only allow one site flip in every transition and we will only consider
flip rate intensities such that

$$\sup_{s, \sigma} C(s, \sigma) < \infty.$$

In many cases we will consider translation invariant systems and then this last
condition will hold trivially. Furthermore, we will always assume the trivial condition
that, for every $s \in S$,

$$\sup_{\sigma : \sigma(s) = 0} C(s, \sigma) > 0, \qquad \sup_{\sigma : \sigma(s) = 1} C(s, \sigma) > 0.$$

We will call such an object a spin system (see [6] or [19] for results concerning
general spin systems). Given such rates, one can obtain a Markov process $\Psi$
on $\{-1, 1\}^S$ governed by these flip rates; see [19]. Such a Markov process with a
specified initial distribution $\mu$ on $\{-1, 1\}^S$ will be denoted by $\Psi^\mu$. Given a
Markov process, $\mu$ will be called an invariant distribution for the process if the
projection of $\Psi^\mu$ onto $\{-1, 1\}^S$ at any fixed time $t \ge 0$ is $\mu$. In this case,
$\Psi^\mu$ will be a stationary Markov process on $\{-1, 1\}^S$, all of whose marginal
distributions are $\mu$. Of course, the state space $\{-1, 1\}^S$ can be exchanged for
either $\{0, 1\}^S$ or $\{0, 1\}^E$.

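Sometimes we will work with two different sets of flip rates,
$\{C_1(s, \sigma)\}_{s \in S,\, \sigma \in \{-1,1\}^S}$ and
$\{C_2(s, \sigma)\}_{s \in S,\, \sigma \in \{-1,1\}^S}$, governing two Markov processes
$\Psi_1$ and $\Psi_2$, respectively. We will write $C_1 \preceq C_2$ if the following
conditions are satisfied:

$$C_2(s, \sigma_2) \ge C_1(s, \sigma_1) \quad \forall s \in S,\ \forall \sigma_1 \preceq \sigma_2 \text{ s.t. } \sigma_1(s) = \sigma_2(s) = 0, \tag{7}$$

and

$$C_1(s, \sigma_1) \ge C_2(s, \sigma_2) \quad \forall s \in S,\ \forall \sigma_1 \preceq \sigma_2 \text{ s.t. } \sigma_1(s) = \sigma_2(s) = 1. \tag{8}$$

The point of $C_1 \preceq C_2$ is that a coupling of $\Psi_1$ and $\Psi_2$ will then exist
for which $\{(\eta, \delta) : \eta(s) \le \delta(s)\ \forall s \in S\}$ is invariant for the
process; see [19].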


2.3. Stochastic Ising models. We will now briefly discuss stochastic Ising
models. We will omit most details; for an extensive discussion and analysis, see
again [19]. Consider $G_n = (S_n, E_n)$, defined in Section 2.1. Given $\beta$ and $h$, it is
possible to construct flip rates $C_n^+$ on $\{-1, 1\}^{\Lambda_n}$ for which
$\mu_n^{+,\beta,h}$ is reversible and invariant. We denote by $\Psi_n^{+,\beta,h}$ the
corresponding stationary Markov process with initial distribution $\mu_n^{+,\beta,h}$. One
possible choice of flip rate intensities is that, for every $s \in \Lambda_n$ and
$\sigma \in \{-1, 1\}^{\Lambda_n}$,

$$C_n^+(s, \sigma) = \exp\Big[-\beta \sum_{t \in \Lambda_n : \{t,s\} \in E_n} \sigma(t)\sigma(s) - \beta \sum_{t \in \Lambda_{n+1} \setminus \Lambda_n : \{t,s\} \in E_n} \sigma(s) - h\sigma(s)\Big].$$

Sites in $\Lambda_{n+1} \setminus \Lambda_n$ are kept fixed at 1. Observe that if
$s \in \Lambda_{n-1}$, the second sum is over an empty set. A straightforward calculation
gives

$$C_n^+(s, \sigma)\,\mu_n^{+,\beta,h}(\sigma) = C_n^+(s, \sigma_s)\,\mu_n^{+,\beta,h}(\sigma_s), \tag{9}$$

where

$$\sigma_s(t) = \begin{cases} \sigma(t), & \text{if } t \ne s, \\ -\sigma(t), & \text{if } t = s. \end{cases}$$

This shows that indeed $\mu_n^{+,\beta,h}$ is reversible and invariant for $C_n^+$. Any family
of spin rates satisfying (9) is called a stochastic Ising model (on our finite set). One
can show that there exists a limiting distribution $\Psi^{+,\beta,h}$ of
$\Psi_n^{+,\beta,h}$ when $n$ tends to infinity; see [19], Theorem 2.2, page 17 and
Theorem 2.7, page 139. Furthermore, $\Psi^{+,\beta,h}$ is a stationary Markov process on
$\{-1, 1\}^{\mathbb{Z}^d}$ with marginal distribution $\mu^{+,\beta,h}$ governed by flip rate
intensities

$$C(s, \sigma) = \exp\Big(-\beta \sum_{t \in \mathbb{Z}^d : \{t,s\} \in E} \sigma(t)\sigma(s) - h\sigma(s)\Big); \tag{10}$$

see [19], Theorem 2.7, page 139. It is also possible to construct $\Psi^{+,\beta,h}$ directly
on $\{-1, 1\}^{\mathbb{Z}^d}$ without going through the limiting procedure. Furthermore,
there are several possible choices of flip rate intensities that can be used to construct
a stationary and reversible Markov process on $\{-1, 1\}^{\mathbb{Z}^d}$ with marginal
distribution $\mu^{+,\beta,h}$. In [19], a stochastic Ising model is defined to be any spin
system with flip rate intensities
$\{C(s, \sigma)\}_{s \in \mathbb{Z}^d,\, \sigma \in \{-1,1\}^{\mathbb{Z}^d}}$ satisfying that, for
each $s \in \mathbb{Z}^d$,

$$C(s, \sigma) \exp\Big(\beta \sum_{t \in \mathbb{Z}^d : \{t,s\} \in E} \sigma(t)\sigma(s) + h\sigma(s)\Big) \tag{11}$$

is independent of $\sigma(s)$. Therefore, when we refer to a stochastic Ising model
$\Psi^{+,\beta,h}$ with marginal distribution $\mu^{+,\beta,h}$, we will have this definition in
mind. It is particularly easy to see that (11) (or the condition of detailed balance as it
is often referred to) is satisfied for the flip rate intensities of (10), but there are
many other rates satisfying this. It is known that the set of so-called Gibbs states
is exactly the same as the class of reversible measures with respect to the flip
rates satisfying (11); see [19], pages 190-196. Note also that the condition specified
in (11) with $\mathbb{Z}^d$ replaced by $\Lambda_n$ is equivalent to that of (9) (modified
with the boundary condition removed).


While we defined above stochastic Ising models on $\{-1, 1\}^{\mathbb{Z}^d}$, this
construction can be done on more general graphs (see [19]).
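
The detailed balance relation can be checked numerically on a small finite graph. The sketch below is an illustration under stated assumptions: it uses free boundary conditions (no boundary term), takes the rates of the form (10) restricted to the finite graph, and compares both sides of the analogue of (9) using unnormalized Ising weights. The function names are hypothetical.

```python
import math
from itertools import product

def weight(sigma, edges, beta, h):
    """Unnormalized free-boundary Ising weight exp(-H(sigma))."""
    interaction = sum(sigma[t] * sigma[u] for t, u in edges)
    return math.exp(beta * interaction + h * sum(sigma))

def flip_rate(s, sigma, edges, beta, h):
    """Rate of the form (10) on a finite graph: exp(-beta * sum_t sigma(t)sigma(s) - h sigma(s))."""
    neighbour_sum = sum(sigma[u if t == s else t] for t, u in edges if s in (t, u))
    return math.exp(-beta * neighbour_sum * sigma[s] - h * sigma[s])

def flipped(sigma, s):
    """The configuration sigma_s: sigma with the spin at site s reversed."""
    return sigma[:s] + (-sigma[s],) + sigma[s + 1:]

def max_balance_error(n_sites, edges, beta, h):
    """Largest absolute violation of C(s,sigma) w(sigma) = C(s,sigma_s) w(sigma_s)."""
    worst = 0.0
    for sigma in product((-1, 1), repeat=n_sites):
        for s in range(n_sites):
            tau = flipped(sigma, s)
            lhs = flip_rate(s, sigma, edges, beta, h) * weight(sigma, edges, beta, h)
            rhs = flip_rate(s, tau, edges, beta, h) * weight(tau, edges, beta, h)
            worst = max(worst, abs(lhs - rhs))
    return worst

triangle = [(0, 1), (1, 2), (0, 2)]
error = max_balance_error(3, triangle, beta=0.7, h=0.3)
```

The error is zero up to floating point rounding because the product of rate and weight no longer depends on the spin at $s$, which is the content of condition (11).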


2.4. The random cluster model. Unlike all other models in this paper, the random
cluster model deals with configurations on the edges $E$ of a graph $G = (S, E)$.
We will review the definition of the regular random cluster measure on general finite
graphs and the "wired" random cluster measure on $\Lambda_n \subseteq \mathbb{Z}^d$. We will
also recall the limiting measures and in the next subsection the connection between the
random cluster model and the Ising model. In doing so we will follow the outlines
of [9] and [13] closely.

Take a finite graph $G = (S, E)$. Define the random cluster measure $\nu_G^{p,q}$ on
$\{0, 1\}^E$ with parameters $p \in [0, 1]$ and $q > 0$ as the probability measure which
assigns to the configuration $\eta \in \{0, 1\}^E$ the probability

$$\nu_G^{p,q}(\eta) = \frac{q^{k(\eta)}}{Z} \prod_{e \in E} p^{\eta(e)} (1 - p)^{1 - \eta(e)}. \tag{12}$$




Here $Z$ is again a normalization constant and $k(\eta)$ is the number of connected
components of $\eta$. From now on we will always take $q = 2$ and, therefore, we will
suppress $q$ in the notation.

Take $G_n = (S_n, E_n)$, where $S_n = \Lambda_{n+1} \subseteq \mathbb{Z}^d$ and $E_n$ is the
set of all nearest neighbor pairs of $\Lambda_{n+1}$. Write $\nu_n^p$ for $\nu_{G_n}^p$ and
define

$$\tilde{\nu}_n^p(\cdot) = \nu_n^p(\,\cdot \mid \text{all edges of } E_n \text{ with both end sites in } \Lambda_{n+1} \setminus \Lambda_n \text{ are present}). \tag{13}$$

This is the so-called "wired" random cluster measure. It is called "wired" since
all edges of the boundary are present. It is immediate from the defining equations
(12) and (13), that, for $e \in E_n$ and any $\xi \in \{0, 1\}^{E_n \setminus e}$,

$$\tilde{\nu}_n^p(\eta(e) = 1 \mid \eta(E_n \setminus e) = \xi) = \begin{cases} p, & \text{if the endpoints of } e \text{ are connected in } \xi, \\ \dfrac{p}{2 - p}, & \text{otherwise.} \end{cases} \tag{14}$$

One can show (see [9] or [13]) that when $n$ tends to infinity, the probability measures
$\{\tilde{\nu}_n^p\}_{n \in \mathbb{N}^+}$ converge to a probability measure $\tilde{\nu}^p$.
Furthermore, the construction of $\tilde{\nu}_n^p$ on $\{0, 1\}^{E_n}$ can be done on any
finite subgraph by connecting all sites of the boundary of the graph with each other. As
a consequence, we can also define random cluster measures on more general graphs than
$\mathbb{Z}^d$; see, for example, [11].
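
The two-case form of the single-edge conditional probability can be checked by hand on a tiny graph. The sketch below is an illustration only: it works with the unconditioned measure (12) with $q = 2$ on a triangle rather than with the wired measure, relying on the fact that the same two-case formula follows directly from (12) for any finite graph. The function names and the triangle example are assumptions made for the illustration.

```python
def n_components(n_sites, open_edges):
    """Number of connected components of the graph with the given open edges."""
    parent = list(range(n_sites))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    for a, b in open_edges:
        parent[find(a)] = find(b)
    return len({find(a) for a in range(n_sites)})

def rc_weight(n_sites, edges, eta, p, q=2.0):
    """Unnormalized random cluster weight of (12) for the edge configuration eta."""
    open_edges = [e for e, x in zip(edges, eta) if x == 1]
    k = n_components(n_sites, open_edges)
    return q ** k * p ** len(open_edges) * (1 - p) ** (len(edges) - len(open_edges))

def cond_open_prob(n_sites, edges, i, xi, p, q=2.0):
    """P(edge i is open | the other edges equal xi), xi listed in edge order without edge i."""
    w_open = rc_weight(n_sites, edges, xi[:i] + (1,) + xi[i:], p, q)
    w_closed = rc_weight(n_sites, edges, xi[:i] + (0,) + xi[i:], p, q)
    return w_open / (w_open + w_closed)

triangle = [(0, 1), (1, 2), (0, 2)]
p = 0.4
connected_case = cond_open_prob(3, triangle, 0, (1, 1), p)      # endpoints joined via site 2
disconnected_case = cond_open_prob(3, triangle, 0, (0, 0), p)   # endpoints not joined
```

In the first case opening the edge does not change the number of components, so the factor $q^{k}$ cancels and the probability is $p$; in the second case closing the edge creates one extra component, which gives $p / (p + 2(1 - p)) = p / (2 - p)$.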


2.5. The random cluster model and the Ising model. Take $G_n = (S_n, E_n)$ as in
Section 2.4. As in [13], let $P_n^p$ be the probability measure on
$\{-1, 1\}^{S_n} \times \{0, 1\}^{E_n}$ defined in the following way:


1. Assign each site of $\Lambda_{n+1} \setminus \Lambda_n$ and every edge with both
endpoints in $\Lambda_{n+1} \setminus \Lambda_n$ the value 1.

2. Assign each site of $\Lambda_n$ the value 1 or $-1$ with equal probability, assign each
edge with not more than one endpoint in $\Lambda_{n+1} \setminus \Lambda_n$ the value 0 or 1
with probabilities $1 - p$ and $p$, respectively. Do this independently for all sites and
edges.

3. Condition on the event that no two sites with different spins have an open edge
connecting them.


One can then check that $P_n^p(\sigma, \{0, 1\}^{E_n}) = \mu_n^{+,\beta}(\sigma)$ with
$\beta = -\log(1 - p)/2$, and that $P_n^p(\{-1, 1\}^{S_n}, \eta) = \tilde{\nu}_n^p(\eta)$.
Here, $P_n^p(\sigma, \{0, 1\}^{E_n})$ is just the marginal in the first coordinate of
$P_n^p$. The same kind of construction can be carried out on any finite graph
$G = (S, E)$.




2.6. The contact process. Consider a graph $G = (S, E)$ of bounded degree. In
the contact process the state space is $\{0, 1\}^S$. Let $\lambda > 0$, and define the flip
rate intensities to be

$$C(s, \sigma) = \begin{cases} 1, & \text{if } \sigma(s) = 1, \\ \lambda \sum_{\{s', s\} \in E} \sigma(s'), & \text{if } \sigma(s) = 0. \end{cases}$$

If we let the initial distribution be $\sigma \equiv 1$, the distribution of this process at
time $t$, which we will denote by $\delta_1 T_\lambda(t)$, is known to converge as $t$ tends
to infinity. This is simply because it is a so-called "attractive" process and
$\sigma \equiv 1$ is the maximal state and $\{\delta_1 T_\lambda(t)\}$ is stochastically
decreasing; see [19], page 265. This limiting distribution will be referred to as the
upper invariant measure for the contact process with parameter $\lambda$ and will be
denoted by $\mu_\lambda$. We then let $\Psi^\lambda$ denote the stationary Markov process on
$\{0, 1\}^S$ with initial (and invariant) distribution $\mu_\lambda$.
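
The contact process rates are simple enough to write down directly, and on a small graph one can check by enumeration that the rates for two parameters $\lambda_1 < \lambda_2$ satisfy the ordering conditions (7) and (8). The sketch below is an illustration; the adjacency-list representation, the function names and the three-site path are assumptions made for the example.

```python
from itertools import product

def contact_rate(s, sigma, neighbors, lam):
    """Flip rate at site s: recovery at rate 1, infection at rate lam per infected neighbour."""
    if sigma[s] == 1:
        return 1.0
    return lam * sum(sigma[t] for t in neighbors[s])

def rates_ordered(neighbors, lam1, lam2):
    """Brute-force check of (7) and (8) with C_1 = contact(lam1) and C_2 = contact(lam2)."""
    n = len(neighbors)
    for sigma1 in product((0, 1), repeat=n):
        for sigma2 in product((0, 1), repeat=n):
            if not all(a <= b for a, b in zip(sigma1, sigma2)):
                continue                      # only ordered pairs sigma1 <= sigma2 matter
            for s in range(n):
                c1 = contact_rate(s, sigma1, neighbors, lam1)
                c2 = contact_rate(s, sigma2, neighbors, lam2)
                if sigma1[s] == sigma2[s] == 0 and c2 < c1:
                    return False              # violates (7)
                if sigma1[s] == sigma2[s] == 1 and c1 < c2:
                    return False              # violates (8)
    return True

path = {0: [1], 1: [0, 2], 2: [1]}            # path graph on three sites
```

For example, `rates_ordered(path, 1.0, 2.0)` holds while `rates_ordered(path, 2.0, 1.0)` does not, which matches the intuition that a larger infection parameter gives the larger process.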


3. $\varepsilon$-movability for monotone measures. In this section we prove movability
results for classes of monotone measures. The finite case is covered by Lemma 3.3,
while the countable case is discussed in Proposition 3.4. In this section we will
always assume that our measures have full support.

For any $|S| < \infty$, $s \in S$, $\xi \in \{0, 1\}^{S \setminus s}$ and probability measure
$\mu$ on $\{0, 1\}^S$, write $\mu^{(*,\varepsilon)}(i \mid \xi)$ for
$\mu^{(*,\varepsilon)}(\sigma(s) = i \mid \sigma(S \setminus s) = \xi)$,
$\mu^{(*,\varepsilon)}(i \cap \xi)$ for
$\mu^{(*,\varepsilon)}(\{\sigma(s) = i\} \cap \{\sigma(S \setminus s) = \xi\})$ and
$\mu^{(*,\varepsilon)}(\xi)$ for $\mu^{(*,\varepsilon)}(\sigma(S \setminus s) = \xi)$. Here
"$*$" can represent either $+$ or $-$ and $i \in \{0, 1\}$. Note that $s$ is suppressed in
the notation and so should be understood from context.

We begin with an easy lemma whose proof is left to the reader. The idea is that
if the configuration outside of $s$ is $\xi$ under $\mu^{(-,\varepsilon)}$, it must have been
at least as large under $\mu$ "before flipping some 1's to 0's"; then use monotonicity.


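LEMMA 3.1. Assume that $\mu$ is a monotone probability measure on $\{0, 1\}^S$,
where $|S| < \infty$. Take $s \in S$ and let $\xi \in \{0, 1\}^{S \setminus s}$. Then, for any
$\varepsilon > 0$, we have that

$$\mu^{(-,\varepsilon)}(1 \mid \xi) \ge (1 - \varepsilon)\,\mu(1 \mid \xi)$$

and that

$$\mu^{(+,\varepsilon)}(0 \mid \xi) \ge (1 - \varepsilon)\,\mu(0 \mid \xi).$$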

The next lemma will be used to prove Lemma 3.3. 


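LEMMA 3.2. Assume that $\mu$ is a monotone probability measure on $\{0, 1\}^S$,
where $|S| < \infty$. For any $\varepsilon > 0$, $\mu^{(-,\varepsilon)}$ is also monotone.


PROOF. Let $s \in S$ be arbitrary, $X \sim \mu$ and
$X^{(-,\varepsilon)} \sim \mu^{(-,\varepsilon)}$. For any
$\delta, \eta \in \{0, 1\}^{S \setminus s}$, define the probability measures $\mu_\delta$ and
$\mu_\eta$ on $\{0, 1\}^{S \setminus s}$ by letting
$\mu_\delta(A) = P(X(S \setminus s) \in A \mid X^{(-,\varepsilon)}(S \setminus s) = \delta)$ and
$\mu_\eta(A) = P(X(S \setminus s) \in A \mid X^{(-,\varepsilon)}(S \setminus s) = \eta)$
for every event $A$ in $\{0, 1\}^{S \setminus s}$, respectively. We will prove that

$$\mu_\delta \preceq \mu_\eta \quad \forall \delta \preceq \eta. \tag{15}$$

This will give us [since $P(X(s) = 1 \mid X(S \setminus s) = \tilde{\eta})$ is an increasing
function of $\tilde{\eta}$] that

$$\begin{aligned}
P(X^{(-,\varepsilon)}(s) = 1 \mid X^{(-,\varepsilon)}(S \setminus s) = \eta)
&= (1 - \varepsilon) \int P(X(s) = 1 \mid X(S \setminus s) = \tilde{\eta})\, d\mu_\eta(\tilde{\eta}) \\
&\ge (1 - \varepsilon) \int P(X(s) = 1 \mid X(S \setminus s) = \tilde{\delta})\, d\mu_\delta(\tilde{\delta}) \\
&= P(X^{(-,\varepsilon)}(s) = 1 \mid X^{(-,\varepsilon)}(S \setminus s) = \delta).
\end{aligned}$$

Since $s$ was chosen arbitrarily, this would prove the statement.

We now prove (15). Define for $\eta \preceq \tilde{\eta}$,
$d(\tilde{\eta}, \eta) = |\{t \in S \setminus s : \tilde{\eta}(t) = 1\}| - |\{t \in S \setminus s : \eta(t) = 1\}|$
and $d(\eta, 0) = |\{t \in S \setminus s : \eta(t) = 1\}|$. Here $|\cdot|$ denotes cardinality.
Let $\mu_{S \setminus s}(\eta) = P(X(S \setminus s) = \eta)$ and define
$\mu^{(-,\varepsilon)}_{S \setminus s}$ similarly. We have that, for $\eta \preceq \tilde{\eta}$,

$$\mu_\eta(\tilde{\eta}) = P(X^{(-,\varepsilon)}(S \setminus s) = \eta \mid X(S \setminus s) = \tilde{\eta})\, \frac{\mu_{S \setminus s}(\tilde{\eta})}{\mu^{(-,\varepsilon)}_{S \setminus s}(\eta)} \tag{16}$$

$$= \varepsilon^{d(\tilde{\eta}, \eta)} (1 - \varepsilon)^{d(\eta, 0)}\, \frac{\mu_{S \setminus s}(\tilde{\eta})}{\mu^{(-,\varepsilon)}_{S \setminus s}(\eta)}. \tag{17}$$

It is well known that $\mu$ being monotone implies that, for every
$\tilde{\delta}, \tilde{\eta}$,

$$\mu_{S \setminus s}(\tilde{\eta} \vee \tilde{\delta})\, \mu_{S \setminus s}(\tilde{\eta} \wedge \tilde{\delta}) \ge \mu_{S \setminus s}(\tilde{\eta})\, \mu_{S \setminus s}(\tilde{\delta}). \tag{18}$$

By a simple modification of Theorem 2.9, page 75 of [19], it is enough for us to
show that

$$\mu_\eta(\tilde{\eta} \vee \tilde{\delta})\, \mu_\delta(\tilde{\eta} \wedge \tilde{\delta}) \ge \mu_\eta(\tilde{\eta})\, \mu_\delta(\tilde{\delta}) \tag{19}$$

for all $\tilde{\eta} \succeq \eta$ and $\tilde{\delta} \succeq \delta$ to show (15). An elementary
calculation shows that

$$d(\tilde{\eta} \vee \tilde{\delta}, \eta) + d(\tilde{\eta} \wedge \tilde{\delta}, \delta) = d(\tilde{\eta}, \eta) + d(\tilde{\delta}, \delta). \tag{20}$$

We therefore get

$$\begin{aligned}
\mu_\eta(\tilde{\eta} \vee \tilde{\delta})\, \mu_\delta(\tilde{\eta} \wedge \tilde{\delta})
&= \varepsilon^{d(\tilde{\eta} \vee \tilde{\delta}, \eta) + d(\tilde{\eta} \wedge \tilde{\delta}, \delta)} (1 - \varepsilon)^{d(\eta, 0) + d(\delta, 0)}\, \frac{\mu_{S \setminus s}(\tilde{\eta} \vee \tilde{\delta})\, \mu_{S \setminus s}(\tilde{\eta} \wedge \tilde{\delta})}{\mu^{(-,\varepsilon)}_{S \setminus s}(\eta)\, \mu^{(-,\varepsilon)}_{S \setminus s}(\delta)} \\
&\ge \varepsilon^{d(\tilde{\eta}, \eta) + d(\tilde{\delta}, \delta)} (1 - \varepsilon)^{d(\eta, 0) + d(\delta, 0)}\, \frac{\mu_{S \setminus s}(\tilde{\eta})}{\mu^{(-,\varepsilon)}_{S \setminus s}(\eta)} \cdot \frac{\mu_{S \setminus s}(\tilde{\delta})}{\mu^{(-,\varepsilon)}_{S \setminus s}(\delta)}
= \mu_\eta(\tilde{\eta})\, \mu_\delta(\tilde{\delta}),
\end{aligned}$$

where (16) is used in the first and last equality and equations (18) and (20) are used
in the inequality. $\Box$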


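LEMMA 3.3. Let $\mu_1, \mu_2$ be probability measures on $\{0, 1\}^S$, where
$|S| < \infty$. Assume that $\mu_2$ is monotone and that

$$A := \inf_{\substack{s \in S \\ \xi \in \{0,1\}^{S \setminus s}}} \big[\mu_2(\sigma(s) = 1 \mid \sigma(S \setminus s) = \xi) - \mu_1(\sigma(s) = 1 \mid \sigma(S \setminus s) = \xi)\big] > 0.$$

Then for any choice of $\varepsilon > 0$, such that

$$A \ge \frac{1}{1 - \varepsilon} - 1,$$

we have

$$\mu_1 \preceq \mu_2^{(-,\varepsilon)}.$$

Hence, $(\mu_1, \mu_2)$ is downward movable.


PROOF. Monotonicity of $\mu_2$, Lemma 3.1, the definition of $A$ and our choice
of $\varepsilon$ give us that, for any $s \in S$ and $\xi \in \{0, 1\}^{S \setminus s}$,

$$\mu_2^{(-,\varepsilon)}(1 \mid \xi) \ge (1 - \varepsilon)\,\mu_2(1 \mid \xi) \ge (1 - \varepsilon)(A + 1)\,\mu_1(1 \mid \xi) \ge (1 - \varepsilon)\,\frac{1}{1 - \varepsilon}\,\mu_1(1 \mid \xi) = \mu_1(1 \mid \xi).$$

By Lemma 3.2, $\mu_2^{(-,\varepsilon)}$ is monotone and so $\forall \xi \preceq \xi'$,

$$\mu_1(1 \mid \xi) \le \mu_2^{(-,\varepsilon)}(1 \mid \xi) \le \mu_2^{(-,\varepsilon)}(1 \mid \xi').$$

The proof is completed by the use of Holley's inequality, Theorem 2.1. $\Box$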
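
Lemma 3.3 can be sanity-checked numerically on a two-site example. The sketch below is an illustration under stated assumptions: $\mu_1$ is taken to be a product measure with density 0.4, $\mu_2$ is a hand-picked monotone measure, $\varepsilon$ is chosen so that $A = 1/(1 - \varepsilon) - 1$ holds with equality, and domination is tested through increasing events as is valid on a finite set. All names and the specific numbers are hypothetical.

```python
from itertools import product

def configs(n):
    return list(product((0, 1), repeat=n))

def leq(a, b):
    return all(x <= y for x, y in zip(a, b))

def thin(mu, n, eps):
    """Law of min(X, Z) for X ~ mu and Z i.i.d. with P(Z(s) = 1) = 1 - eps."""
    out = {c: 0.0 for c in configs(n)}
    for big, mass in mu.items():
        for small in configs(n):
            if leq(small, big):
                kept = sum(small)              # 1's that survive, each with prob 1 - eps
                dropped = sum(big) - kept      # 1's turned into 0's, each with prob eps
                out[small] += mass * eps ** dropped * (1 - eps) ** kept
    return out

def cond_one(mu, s, c):
    """mu(sigma(s) = 1 | sigma agrees with c away from s)."""
    one = c[:s] + (1,) + c[s + 1:]
    zero = c[:s] + (0,) + c[s + 1:]
    return mu[one] / (mu[one] + mu[zero])

def gap(mu1, mu2, n):
    """The quantity A of Lemma 3.3."""
    return min(cond_one(mu2, s, c) - cond_one(mu1, s, c)
               for s in range(n) for c in configs(n))

def dominated(mu, nu, n, tol=1e-12):
    """Finite-set domination check via all increasing events."""
    cs = configs(n)
    for mask in range(1 << len(cs)):
        ev = [c for i, c in enumerate(cs) if (mask >> i) & 1]
        if all(b in ev for a in ev for b in cs if leq(a, b)):
            if sum(mu[c] for c in ev) > sum(nu[c] for c in ev) + tol:
                return False
    return True

n = 2
mu1 = {c: 0.4 ** sum(c) * 0.6 ** (n - sum(c)) for c in configs(n)}
raw = {(0, 0): 1.0, (1, 0): 2.0, (0, 1): 2.0, (1, 1): 8.0}   # illustrative monotone weights
mu2 = {c: w / sum(raw.values()) for c, w in raw.items()}
A = gap(mu1, mu2, n)
eps = A / (1 + A)                  # solves A = 1/(1 - eps) - 1
mu2_thinned = thin(mu2, n, eps)
```

Here `dominated(mu1, mu2_thinned, n)` holds, as the lemma predicts, while the reverse domination fails.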


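PROPOSITION 3.4. Let $S$ be any finite or countable set and consider
$\{S_n\}_{n \in \mathbb{N}^+}$, a collection of sets such that $|S_n| < \infty$
$\forall n \in \mathbb{N}^+$ and $S_n \uparrow S$. Let $\{\mu_{1,n}\}_{n \in \mathbb{N}^+}$,
$\{\mu_{2,n}\}_{n \in \mathbb{N}^+}$ be two collections of probability measures, where
$\mu_{1,n}, \mu_{2,n}$ are probability measures on $\{0, 1\}^{S_n}$ for every
$n \in \mathbb{N}^+$. Furthermore, assume that all of the probability measures
$\{\mu_{1,n}\}_{n \in \mathbb{N}^+}$ ($\{\mu_{2,n}\}_{n \in \mathbb{N}^+}$) are monotone, that
$\mu_{1,n} \to \mu_1$ and that $\mu_{2,n} \to \mu_2$. Set

$$A_n := \inf_{\substack{s \in S_n \\ \xi \in \{0,1\}^{S_n \setminus s}}} \big[\mu_{2,n}(\sigma(s) = 1 \mid \sigma(S_n \setminus s) = \xi) - \mu_{1,n}(\sigma(s) = 1 \mid \sigma(S_n \setminus s) = \xi)\big].$$

If

$$\inf_{n \in \mathbb{N}^+} A_n > 0,$$

then $(\mu_1, \mu_2)$ is both upward and downward movable.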




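PROOF. Take $\varepsilon > 0$ such that

$$\inf_{n \in \mathbb{N}^+} A_n \ge \frac{1}{1 - \varepsilon} - 1.$$

With this choice of $\varepsilon$, Lemma 3.3 says that $(\mu_{1,n}, \mu_{2,n})$ is upward
(downward) $\varepsilon$-movable. Since $\mu_{1,n} \to \mu_1$ and $\mu_{2,n} \to \mu_2$, we
easily get that $\mu_{1,n}^{(+,\varepsilon)} \to \mu_1^{(+,\varepsilon)}$ and
$\mu_{2,n}^{(-,\varepsilon)} \to \mu_2^{(-,\varepsilon)}$. Furthermore, since the relations

$$\mu_{1,n} \preceq \mu_{2,n}^{(-,\varepsilon)}$$

and

$$\mu_{1,n}^{(+,\varepsilon)} \preceq \mu_{2,n}$$

are easily seen to be preserved under weak limits, we get that

$$\mu_1 \preceq \mu_2^{(-,\varepsilon)}$$

and $\mu_1^{(+,\varepsilon)} \preceq \mu_2$. $\Box$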


4. $\varepsilon$-movability for the contact process and a 0-1 law. The conditions in
our next proposition might seem overly technical; however, these represent the
essential features of the contact process (after a small suitable time rescaling) and,
therefore, we feel it is instructive to highlight these features. In Proposition 4.1
and Lemmas 5.1, 5.2 and 8.1 we will use the so-called graphical representation to
define our processes; see, for instance, [19], page 172.


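PROPOSITION 4.1. Let $\mu_1$ and $\mu_2$ be two probability measures defined
on $\{0, 1\}^S$, where $S$ is a countable set. Assume that $\mu_1 \preceq \mu_2$ and that
there exist two stationary Markov processes $\Psi_1$ and $\Psi_2$, governed by flip rate
intensities $\{C_1(s, \sigma_1)\}_{s \in S,\, \sigma_1 \in \{0,1\}^S}$ and
$\{C_2(s, \sigma_2)\}_{s \in S,\, \sigma_2 \in \{0,1\}^S}$, respectively, and with marginal
distributions $\mu_1$ and $\mu_2$. Assume that $C_1 \preceq C_2$ [conditions (7) and (8) of
Section 2.2]. Consider the following conditions:


1. There exists an $\varepsilon_1 > 0$ such that

$$C_2(s, \sigma_2) - C_1(s, \sigma_1) \ge \varepsilon_1 \quad \forall s \in S,\ \forall \sigma_2 \succeq \sigma_1 \text{ s.t. } \sigma_2(s) = 0 \text{ and } C_1(s, \sigma_1) \ne 0. \tag{21}$$

2. There exists an $\varepsilon_2 > 0$ such that

$$C_1(s, \sigma_1) - C_2(s, \sigma_2) \ge \varepsilon_2 \quad \forall s \in S,\ \forall \sigma_2 \succeq \sigma_1 \text{ s.t. } \sigma_1(s) = 1 \text{ and } C_2(s, \sigma_2) \ne 0. \tag{22}$$

3. There exists an $\varepsilon_3 > 0$ such that

$$C_1(s, \sigma_1) \ge \varepsilon_3 \quad \forall s \in S,\ \forall \sigma_1 \text{ s.t. } \sigma_1(s) = 1. \tag{23}$$

4. There exists an $\varepsilon_4 > 0$ such that

$$C_2(s, \sigma_2) \ge \varepsilon_4 \quad \forall s \in S,\ \forall \sigma_2 \text{ s.t. } \sigma_2(s) = 0. \tag{24}$$


If conditions 1, 2 and 3 are satisfied, then $(\mu_1, \mu_2)$ is downward movable.
If conditions 1, 2 and 4 are satisfied, then $(\mu_1, \mu_2)$ is upward movable.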


PROOF. We will prove the first statement; the second follows by symmetry. Define

$\lambda := \sup_{s,\sigma_2 :\, \sigma_2(s)=0} C_2(s,\sigma_2) + \sup_{s,\sigma_1 :\, \sigma_1(s)=1} C_1(s,\sigma_1).$

Our aim is to construct a coupling of the processes $\{X_{1,t}\}_{t\ge 0} \sim \Psi_1$ and $\{X_{2,t}\}_{t\ge 0} \sim \Psi_2$ such that $X_{1,t} \le X_{2,t}$ $\forall t \ge 0$ in such a way that we prove the proposition. Before presenting the actual coupling, we will discuss the idea behind it. For every site $s \in S$, associate an independent Poisson process with parameter $\lambda$. Next, let $\{U_{s,k}\}_{s\in S,\,k\ge 1}$ and $\{U'_{s,k}\}_{s\in S,\,k\ge 1}$ be independent uniform $[0,1]$ random variables also independent of the Poisson processes. If $\tau$ is an arrival time for the Poisson process at site $s$, we write $U_{s,\tau}$ for $U_{s,k}$, where $k$ is such that $\tau$ is the $k$th arrival of the Poisson process at site $s$. Now, let $\tau$ be an arrival time for the Poisson process associated to a site $s$. For $i \in \{1,2\}$, let $X_{i,\tau-}$ and $X_{i,\tau+}$ denote the configurations before and after the arrival. We will let the outcome of $U_{s,\tau}$ decide what happens with the $\{X_{2,t}\}_{t\ge 0}$ process at time $t = \tau$, and then we will let $U_{s,\tau}$ together with $U'_{s,\tau}$ decide what happens with the $\{X_{1,t}\}_{t\ge 0}$ process at time $t = \tau$. As we will see, we will do this so that $X_{1,t} \le X_{2,t}$ for all $t \ge 0$. Furthermore, we will do this in such a way that there exists an $\varepsilon \in (0,1)$ such that if $U'_{s,\tau} \ge 1 - \varepsilon$, then $X_{1,\tau+}(s) = 0$ regardless of the outcome of $U_{s,\tau}$. Consider now the process $\{X^{\varepsilon}_t\}_{t\ge 0}$ we get by taking $X^{\varepsilon}_0(s) = 1$ for every $s \in S$ and letting $\{X^{\varepsilon}_t(s)\}_{t\ge 0}$ be updated at every arrival time $\tau$ for the Poisson process associated to $s$, and updated in such a way that $X^{\varepsilon}_{\tau+}(s) = 0$ if $U'_{s,\tau} \ge 1 - \varepsilon$, and $X^{\varepsilon}_{\tau+}(s) = 1$ if $U'_{s,\tau} < 1 - \varepsilon$. Of course, the distribution of $X^{\varepsilon}_t$ will converge to $\pi_{1-\varepsilon}$. Observe that whenever $X^{\varepsilon}_t(s) = 0$, we have that $X_{1,t}(s) = 0$. Therefore, we can conclude that

(25) $X_{1,t} \le \min(X_{2,t}, X^{\varepsilon}_t) \quad \forall t \ge 0.$


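Furthermore, since the process $\{X^{\varepsilon}_t\}_{t\ge 0}$ does not depend on any $U_{s,\tau}$, we have that $X^{\varepsilon}_t(s)$ is conditionally independent of $X_{2,t}$ if there has been an arrival for the Poisson process associated to $s$ before time $t$. Let $s_i$, $i \in \{1,\dots,n\}$, be distinct sites in $S$ and let $A_t$ be the event that all Poisson processes associated to $s_1$ through $s_n$ have had an arrival by time $t$. Of course, $P(A_t) = (1 - e^{-\lambda t})^n$ and so we get that

$P(X_{2,t}X^{\varepsilon}_t(s_1) = \cdots = X_{2,t}X^{\varepsilon}_t(s_n) = 1)$
$\quad = P(X_{2,t}X^{\varepsilon}_t(s_1) = \cdots = X_{2,t}X^{\varepsilon}_t(s_n) = 1 \mid A_t)P(A_t) + P(X_{2,t}X^{\varepsilon}_t(s_1) = \cdots = X_{2,t}X^{\varepsilon}_t(s_n) = 1 \mid A_t^c)P(A_t^c)$
$\quad = P(X_{2,t}(s_1) = \cdots = X_{2,t}(s_n) = 1 \mid A_t)\,P(X^{\varepsilon}_t(s_1) = \cdots = X^{\varepsilon}_t(s_n) = 1 \mid A_t)\,P(A_t) + P(X_{2,t}X^{\varepsilon}_t(s_1) = \cdots = X_{2,t}X^{\varepsilon}_t(s_n) = 1 \mid A_t^c)P(A_t^c)$
$\quad = P(X_{2,t}(s_1) = \cdots = X_{2,t}(s_n) = 1 \mid A_t)\,P(A_t)(1-\varepsilon)^n + P(X_{2,t}X^{\varepsilon}_t(s_1) = \cdots = X_{2,t}X^{\varepsilon}_t(s_n) = 1 \mid A_t^c)P(A_t^c)$
$\quad = P(\{X_{2,t}(s_1) = \cdots = X_{2,t}(s_n) = 1\} \cap A_t)(1-\varepsilon)^n + P(X_{2,t}X^{\varepsilon}_t(s_1) = \cdots = X_{2,t}X^{\varepsilon}_t(s_n) = 1 \mid A_t^c)P(A_t^c)$
$\quad \ge \big(P(X_{2,t}(s_1) = \cdots = X_{2,t}(s_n) = 1) - P(A_t^c)\big)(1-\varepsilon)^n + P(X_{2,t}X^{\varepsilon}_t(s_1) = \cdots = X_{2,t}X^{\varepsilon}_t(s_n) = 1 \mid A_t^c)P(A_t^c)$
$\quad = P(X_{2,t}(s_1) = \cdots = X_{2,t}(s_n) = 1)(1-\varepsilon)^n + P(A_t^c)\big(P(X_{2,t}X^{\varepsilon}_t(s_1) = \cdots = X_{2,t}X^{\varepsilon}_t(s_n) = 1 \mid A_t^c) - (1-\varepsilon)^n\big)$
$\quad = \mu_2^{(-,\varepsilon)}(\sigma(s_1) = \cdots = \sigma(s_n) = 1) + P(A_t^c)\big(P(X_{2,t}X^{\varepsilon}_t(s_1) = \cdots = X_{2,t}X^{\varepsilon}_t(s_n) = 1 \mid A_t^c) - (1-\varepsilon)^n\big)$
$\quad \to \mu_2^{(-,\varepsilon)}(\sigma(s_1) = \cdots = \sigma(s_n) = 1) \quad$ as $t \to \infty$.

In addition,

$P(\{X_{2,t}(s_1) = \cdots = X_{2,t}(s_n) = 1\} \cap A_t)(1-\varepsilon)^n$
$\quad \le P(X_{2,t}(s_1) = \cdots = X_{2,t}(s_n) = 1)(1-\varepsilon)^n$
$\quad = \mu_2^{(-,\varepsilon)}(\sigma(s_1) = \cdots = \sigma(s_n) = 1).$

Hence, by inclusion-exclusion, we have that the distribution of $\min(X_{2,t}, X^{\varepsilon}_t)$ approaches $\mu_2^{(-,\varepsilon)}$ as $t$ tends to infinity. So, by taking the limit in (25), we get that $\mu_1 \preceq \mu_2^{(-,\varepsilon)}$, as desired.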
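The convergence of the auxiliary process $X^{\varepsilon}_t$ to $\pi_{1-\varepsilon}$ can be made explicit: a site equals 1 at time $t$ either because its Poisson process (rate $\lambda$) has had no arrival yet, or because the uniform variable at its last arrival fell below $1-\varepsilon$. A small sketch of this formula follows; the function name `prob_all_ones` is hypothetical and the code is only an illustration of the argument above.

```python
import math


def prob_all_ones(lam, eps, t, n=1):
    """P(X^eps_t(s_1) = ... = X^eps_t(s_n) = 1) for n distinct sites.

    Each site starts at 1; at every arrival of its rate-lam Poisson process
    it is reset to 0 with probability eps and to 1 with probability 1 - eps,
    independently of the other sites.
    """
    no_arrival = math.exp(-lam * t)
    single_site = no_arrival + (1.0 - no_arrival) * (1.0 - eps)
    return single_site ** n
```

As $t \to \infty$ the value decreases to $(1-\varepsilon)^n$, the probability of the same event under $\pi_{1-\varepsilon}$.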

Now to the construction. Take $X_{1,0} \sim \mu_1$, $X_{2,0} \sim \mu_2$, such that $X_{1,0} \le X_{2,0}$. Let $\tau$ be an arrival time for the Poisson process associated to $s$. Take $U_{s,\tau}$ and $U'_{s,\tau}$. The following transition rules apply:

$X_{2,\tau-}(s)$    $X_{2,\tau+}(s)$    if
0    1    $U_{s,\tau} < C_2(s, X_{2,\tau-})/\lambda$
1    0    $U_{s,\tau} > (\lambda - C_2(s, X_{2,\tau-}))/\lambda$

It is easy to check that the process $\{X_{2,t}\}_{t\ge 0}$ thus constructed will have the right flip-rate intensities. The construction of $\{X_{1,t}\}_{t\ge 0}$ is slightly more complicated. If $C_2(s, X_{2,\tau-}) = 0$ and $X_{2,\tau-}(s) = 0$, then it follows from (7) that $C_1(s, X_{1,\tau-}) = 0$, and in that case we interpret $C_1(s, X_{1,\tau-})/C_2(s, X_{2,\tau-})$ as 0. Observe that $C_2(s, X_{2,\tau-})$ can be 0 when $X_{2,\tau-}(s) = 1$, but it will not cause any problems. With these observations in mind, these are the transition rules we apply:

$(X_{1,\tau-}(s), X_{2,\tau-}(s))$    $(X_{1,\tau+}(s), X_{2,\tau+}(s))$    if
(0, 0)    (1, 1)    $U_{s,\tau} < C_2(s, X_{2,\tau-})/\lambda$ and $U'_{s,\tau} \le C_1(s, X_{1,\tau-})/C_2(s, X_{2,\tau-})$
(0, 0)    (0, 1)    $U_{s,\tau} < C_2(s, X_{2,\tau-})/\lambda$ and $U'_{s,\tau} > C_1(s, X_{1,\tau-})/C_2(s, X_{2,\tau-})$
(0, 0)    (0, 0)    otherwise
(0, 1)    (0, 0)    $U_{s,\tau} > (\lambda - C_2(s, X_{2,\tau-}))/\lambda$
(0, 1)    (1, 1)    $U_{s,\tau} < \sup_{s,\sigma_2 :\, \sigma_2(s)=0} C_2(s,\sigma_2)/\lambda$ and $U'_{s,\tau} \le C_1(s, X_{1,\tau-})/\sup_{s,\sigma_2 :\, \sigma_2(s)=0} C_2(s,\sigma_2)$
(0, 1)    (0, 1)    otherwise
(1, 1)    (0, 0)    $U_{s,\tau} > (\lambda - C_2(s, X_{2,\tau-}))/\lambda$
(1, 1)    (0, 1)    $U_{s,\tau} \le (\lambda - C_2(s, X_{2,\tau-}))/\lambda$ and $U'_{s,\tau} > (\lambda - C_1(s, X_{1,\tau-}))/(\lambda - C_2(s, X_{2,\tau-}))$
(1, 1)    (1, 1)    otherwise.


It is not difficult to check that all flip rate intensities are correct and that $X_{1,t} \le X_{2,t}$ for all $t \ge 0$. Observe that, by the definition of $\lambda$, the events $\{U_{s,\tau} > (\lambda - C_2(s, X_{2,\tau-}))/\lambda\}$ and $\{U_{s,\tau} < \sup_{s,\sigma_2 :\, \sigma_2(s)=0} C_2(s,\sigma_2)/\lambda\}$ are disjoint when $(X_{1,\tau-}(s), X_{2,\tau-}(s)) = (0, 1)$.


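We now want to show that there exists an $\varepsilon > 0$ so that $U'_{s,\tau} \ge 1 - \varepsilon$ implies that $X_{1,\tau+}(s) = 0$. Note that if $(X_{1,\tau-}(s), X_{2,\tau-}(s)) = (0,0)$ and $C_1(s, X_{1,\tau-}) > 0$ [$\Rightarrow C_2(s, X_{2,\tau-}) > 0$], then

$\frac{C_1(s, X_{1,\tau-})}{C_2(s, X_{2,\tau-})} \le \frac{C_2(s, X_{2,\tau-}) - \varepsilon_1}{C_2(s, X_{2,\tau-})} = 1 - \frac{\varepsilon_1}{C_2(s, X_{2,\tau-})} \le 1 - \frac{\varepsilon_1}{\sup_{s,\sigma_2 :\, \sigma_2(s)=0} C_2(s,\sigma_2)},$

and if $(X_{1,\tau-}(s), X_{2,\tau-}(s)) = (0,0)$ and $C_1(s, X_{1,\tau-}) = 0$, then

$\frac{C_1(s, X_{1,\tau-})}{C_2(s, X_{2,\tau-})} = 0.$

Furthermore, if $(X_{1,\tau-}(s), X_{2,\tau-}(s)) = (0,1)$ and $C_1(s, X_{1,\tau-}) > 0$, then

$\frac{C_1(s, X_{1,\tau-})}{\sup_{s,\sigma_2 :\, \sigma_2(s)=0} C_2(s,\sigma_2)} \le 1 - \frac{\varepsilon_1}{\sup_{s,\sigma_2 :\, \sigma_2(s)=0} C_2(s,\sigma_2)},$

while again if $(X_{1,\tau-}(s), X_{2,\tau-}(s)) = (0,1)$ and $C_1(s, X_{1,\tau-}) = 0$, then the 0 never changes to a 1. Finally, if $(X_{1,\tau-}(s), X_{2,\tau-}(s)) = (1,1)$ and $C_2(s, X_{2,\tau-}) > 0$ [$\Rightarrow C_1(s, X_{1,\tau-}) > 0$], then

$\frac{\lambda - C_1(s, X_{1,\tau-})}{\lambda - C_2(s, X_{2,\tau-})} \le \frac{\lambda - C_2(s, X_{2,\tau-}) - \varepsilon_2}{\lambda - C_2(s, X_{2,\tau-})} = 1 - \frac{\varepsilon_2}{\lambda - C_2(s, X_{2,\tau-})} \le 1 - \frac{\varepsilon_2}{\lambda},$

and if $(X_{1,\tau-}(s), X_{2,\tau-}(s)) = (1,1)$ and $C_2(s, X_{2,\tau-}) = 0$,

$\frac{\lambda - C_1(s, X_{1,\tau-})}{\lambda - C_2(s, X_{2,\tau-})} = 1 - \frac{C_1(s, X_{1,\tau-})}{\lambda} \le 1 - \frac{\varepsilon_3}{\lambda}.$

Therefore, whenever

$U'_{s,\tau} > \max\Big(1 - \frac{\varepsilon_1}{\sup_{s,\sigma_2 :\, \sigma_2(s)=0} C_2(s,\sigma_2)},\ 1 - \frac{\varepsilon_2}{\lambda},\ 1 - \frac{\varepsilon_3}{\lambda}\Big),$

we have that $X_{1,\tau+}(s) = 0$ regardless of the outcome of $U_{s,\tau}$. Therefore, $(\mu_1, \mu_2)$ is downward $\varepsilon$-movable where

$\varepsilon = 1 - \max\Big(1 - \frac{\varepsilon_1}{\sup_{s,\sigma_2 :\, \sigma_2(s)=0} C_2(s,\sigma_2)},\ 1 - \frac{\varepsilon_2}{\lambda},\ 1 - \frac{\varepsilon_3}{\lambda}\Big) = \min\Big(\frac{\varepsilon_1}{\sup_{s,\sigma_2 :\, \sigma_2(s)=0} C_2(s,\sigma_2)},\ \frac{\varepsilon_2}{\lambda},\ \frac{\varepsilon_3}{\lambda}\Big).$ □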

PROOF OF THEOREM 1.11. Take $\delta > 0$ such that $\lambda_1(1+\delta) < \lambda_2$ and consider the process $\{X_t\}_{t\ge 0}$ constructed in the following way. Take $X_0 \equiv 1$ and let the process evolve with flip rate intensities

(26) $C_1(s,\sigma) = \begin{cases} 1+\delta, & \text{if } \sigma(s) = 1, \\ \lambda_1(1+\delta)\sum_{s' \sim s} \sigma(s'), & \text{if } \sigma(s) = 0. \end{cases}$

Denote the limiting distribution of $X_t$ as $t$ tends to infinity by $\mu_{1+\delta,\lambda_1(1+\delta)}$. It is easy to see that this process is just a time-scaling of the contact process constructed in Section 2.6 with parameter $\lambda_1$. Recall that that process had limiting distribution $\mu_{\lambda_1}$, the upper invariant measure for the contact process. Thus, we have $\mu_{\lambda_1} = \mu_{1+\delta,\lambda_1(1+\delta)}$. By Proposition 4.1 with $C_1$ as above and $C_2$ as in Section 2.6 with parameter $\lambda_2$, there exists an $\varepsilon > 0$ such that

$\mu_{1+\delta,\lambda_1(1+\delta)} \preceq \mu_{\lambda_2}^{(-,\varepsilon)}.$

Hence, $(\mu_{\lambda_1}, \mu_{\lambda_2})$ is downward movable. □
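The application of Proposition 4.1 in this proof rests on conditions 1, 2 and 3 holding for the rescaled rates (26) against the rates of the contact process with parameter $\lambda_2$. One natural choice of constants is $\varepsilon_1 = \lambda_2 - \lambda_1(1+\delta)$, $\varepsilon_2 = \delta$ and $\varepsilon_3 = 1+\delta$. The following brute-force check on a 4-cycle is only a sketch: it assumes the contact process of Section 2.6 has death rate 1 and infection rate $\lambda$ times the number of infected neighbours, and the graph, function names and constants are hypothetical choices made for illustration.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical test graph: a cycle on four sites.
NEIGHBOURS = {0: (1, 3), 1: (0, 2), 2: (1, 3), 3: (2, 0)}


def contact_rate(s, sigma, death, infection):
    """Flip rate at site s: `death` if infected, else `infection` per infected neighbour."""
    if sigma[s] == 1:
        return death
    return infection * sum(sigma[u] for u in NEIGHBOURS[s])


def check_conditions(lam1, lam2, delta):
    """Check (21), (22), (23) for C1 as in (26) and C2 the lam2-contact process."""
    eps1 = lam2 - lam1 * (1 + delta)
    eps2 = delta
    eps3 = 1 + delta
    configs = list(product((0, 1), repeat=4))
    for sigma1 in configs:
        for sigma2 in configs:
            if any(a > b for a, b in zip(sigma1, sigma2)):
                continue  # only ordered pairs sigma1 <= sigma2 are relevant
            for s in range(4):
                c1 = contact_rate(s, sigma1, 1 + delta, lam1 * (1 + delta))
                c2 = contact_rate(s, sigma2, F(1), lam2)
                if sigma2[s] == 0 and c1 != 0 and c2 - c1 < eps1:
                    return False  # condition 1 fails
                if sigma1[s] == 1 and c2 != 0 and c1 - c2 < eps2:
                    return False  # condition 2 fails
                if sigma1[s] == 1 and c1 < eps3:
                    return False  # condition 3 fails
    return True
```

With $\lambda_1 = 1$, $\lambda_2 = 2$ and $\delta = 1/2$ the check succeeds, whereas with $\lambda_1 = \lambda_2$ no positive $\varepsilon_1$ is available and the same test fails.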




For the rest of this section we will only consider the graph $\mathbb{T}^d$ for $d \ge 2$. The following is a 0-1 law for the upper invariant measure for the contact process.


PROPOSITION 4.2. Let $A \subseteq \{0,1\}^{\mathbb{T}^d}$, where $d \ge 2$, be a set which is invariant under all graph automorphisms on $\mathbb{T}^d$. Then, for $\lambda > 0$, we have that

$\mu_\lambda(A) \in \{0, 1\}.$
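PROOF. Let $\varepsilon > 0$. By elementary measure theory, there exists a cylinder event $B$ depending on finitely many coordinates such that

(27) $\mu_\lambda(A \,\triangle\, B) < \varepsilon.$

Let supp $B$ denote the finite number of coordinates with respect to which $B$ is measurable. Letting $\{T_\lambda(t)\}_{t\ge 0}$ denote the Markov semigroup for the contact process with parameter $\lambda$, we have that $\delta_1 T_\lambda(t) \to \mu_\lambda$ and also that $\mu_\lambda \preceq \delta_1 T_\lambda(t)$ for every $t \ge 0$. Choose $t$ so that, for all (equivalently, some) sites $s$,

$\delta_1 T_\lambda(t)(\eta(s) = 1) \le \mu_\lambda(\eta(s) = 1) + \frac{\varepsilon}{2|\mathrm{supp}\, B|}.$

It follows easily that if $m$ is any coupling of $\delta_1 T_\lambda(t)$ and $\mu_\lambda$ which is concentrated on $\{(\eta, \delta) : \eta \ge \delta\}$, then, for any finite set $S$ of sites,

$m((\eta, \delta) : \eta(s) \ne \delta(s) \text{ occurs for some } s \in S) \le \frac{|S|\,\varepsilon}{2|\mathrm{supp}\, B|}.$

In particular, if $E$ is any event depending on at most $2|\mathrm{supp}\, B|$ sites, then

(28) $|\delta_1 T_\lambda(t)(E) - \mu_\lambda(E)| \le \varepsilon.$

For this fixed $t$, Theorem 4.6, page 35 of [19] shows that there exists an automorphism $\gamma \in \mathrm{AUT}(\mathbb{T}^d)$ such that

(29) $|\delta_1 T_\lambda(t)(B \cap \gamma B) - \delta_1 T_\lambda(t)(B)\,\delta_1 T_\lambda(t)(\gamma B)| < \varepsilon.$

Furthermore, since $\mu_\lambda$ is invariant under automorphisms, (27) implies that

$\mu_\lambda(\gamma A \,\triangle\, \gamma B) < \varepsilon,$

and since $A = \gamma A$, we have

$\mu_\lambda(A \,\triangle\, \gamma B) < \varepsilon.$

It follows that

$\mu_\lambda(B \,\triangle\, \gamma B) \le \mu_\lambda(A \,\triangle\, \gamma B) + \mu_\lambda(B \,\triangle\, A) < 2\varepsilon.$

Next, (28) implies that

$|\delta_1 T_\lambda(t)(B \,\triangle\, \gamma B) - \mu_\lambda(B \,\triangle\, \gamma B)| \le \varepsilon,$

and so

(30) $\delta_1 T_\lambda(t)(B \,\triangle\, \gamma B) < 3\varepsilon.$

We get that

$|\mu_\lambda(A) - \mu_\lambda(A)^2| = |\mu_\lambda(A) - \mu_\lambda(A)\mu_\lambda(\gamma A)|$
$\quad \le |\mu_\lambda(B) - \mu_\lambda(B)\mu_\lambda(\gamma B)| + 3\varepsilon$
$\quad \le |\delta_1 T_\lambda(t)(B) - \delta_1 T_\lambda(t)(B)\,\delta_1 T_\lambda(t)(\gamma B)| + 6\varepsilon$
$\quad \le |\delta_1 T_\lambda(t)(B) - \delta_1 T_\lambda(t)(B \cap \gamma B)| + 7\varepsilon$
$\quad \le \delta_1 T_\lambda(t)(B \,\triangle\, \gamma B) + 7\varepsilon < 10\varepsilon,$

where we used (27), (28) and (29) for the first three inequalities and (30) in the last. Since $\varepsilon > 0$ was chosen arbitrarily, we get that

$\mu_\lambda(A) = \mu_\lambda(A)^2$

and so $\mu_\lambda(A) \in \{0, 1\}$. □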


REMARKS. The above proof works for any transitive and even quasi-transitive graph. For the case of $\mathbb{Z}^d$, this was proved in Proposition 2.16, page 143 of [19]. It is mentioned there that, while $\delta_1 T_\lambda(t)$ is ergodic for each $t$, one cannot conclude immediately the ergodicity of $\mu_\lambda$ because the class of ergodic processes is not weakly closed. We point out, however, that there is another important notion of convergence given by the $\bar d$-metric (see [24], page 89 for definition) on stationary processes. Convergence in this metric is stronger than weak convergence and weaker than convergence in the total variation norm. It is also known that the ergodic processes are $\bar d$-closed and that weak convergence together with stochastic ordering implies $\bar d$-convergence. In this way, one can conclude ergodicity of $\mu_\lambda$ using the $\bar d$-metric, giving an alternative proof of Proposition 2.16 of [19]. In fact, the proof of Proposition 4.2 is essentially based on this idea. However, because of the open question listed below, it is not so easy to formulate the $\bar d$-metric for tree indexed processes and so we choose a more hands-on approach. Observe that the crucial property of $\bar d$-convergence which is essentially used in the above proof is that, for each fixed $k$, one has uniform convergence of the probability measures (in, say, the total variation norm) over all sets which depend on at most $k$ points. (The point is that the $k$ points can lie anywhere and, hence, this is much stronger than weak convergence.)


Open question related to defining the $\bar d$-metric for tree indexed processes. Assume that $\mu$ and $\nu$ are two automorphism invariant probability measures on $\{0,1\}^{\mathbb{T}^d}$ such that $\mu \preceq \nu$. Does there exist a $\mathbb{T}^d$-invariant coupling $(X, Y)$ with $X \sim \mu$, $Y \sim \nu$ and $X \le Y$?




PROPOSITION 4.3. On $\mathbb{T}^d$, $d \ge 2$, there exists a $\lambda_p$ such that, for all $\lambda > \lambda_p$,

$\mu_\lambda(\mathcal{C}^+) = 1.$


PROOF. By Theorem 1.33(c), page 275 in [19], for sufficiently large $\lambda$, $\mu_\lambda(\eta(s) = 1) > 2/3$. By [12], we have that if $\mu_\lambda(\eta(s) = 1) > 2/3$, then

$\mu_\lambda(\mathcal{C}^+) > 0.$

Finally, Proposition 4.2 then implies that

$\mu_\lambda(\mathcal{C}^+) = 1.$ □


5. Relationship between $\varepsilon$-movability and dynamics. In the general setup we have a family of stationary Markov processes parametrized by one or two parameters, for example, the contact processes $\Psi^{\lambda}$ ($\lambda$ is here the only parameter) or a stochastic Ising model $\Psi^{+,\beta,h}$ ($\beta$ and $h$ being the parameters). Many of the proofs in this paper will involve comparing the marginal distributions of these Markov processes for two different values of one of the involved parameters. Let $p$ be the parameter and let $p_1 < p_2$. Assume that the marginal distributions are $\mu_{p_1}$ and $\mu_{p_2}$, respectively, and that $\mu_{p_1} \preceq \mu_{p_2}$. Lemmas 5.1 and 5.2 show that there is a close connection between showing that $(\mu_{p_1}, \mu_{p_2})$ is downward $\varepsilon$-movable and that the infimum of the second process over a short time interval is stochastically larger than the first process.

Let $\Psi^{\mu}$ be a stationary Markov process on $\{0,1\}^S$ with marginal distribution $\mu$ and let $\{X_t\}_{t\ge 0} \sim \Psi^{\mu}$. For $\delta > 0$ and $s \in S$, define

$X_{\inf,\delta}(s) := \inf_{t \in [0,\delta]} X_t(s),$

and denote the distribution of $X_{\inf,\delta}$ by $\mu_{\inf,\delta}$. Similarly, define

$X_{\sup,\delta}(s) := \sup_{t \in [0,\delta]} X_t(s),$

and denote the distribution of $X_{\sup,\delta}$ by $\mu_{\sup,\delta}$.


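LEMMA 5.1. Take $S$ to be the sites of a bounded degree graph. Let $\{C(s,\sigma)\}_{s\in S,\,\sigma\in\{-1,1\}^S}$ be the flip rate intensities for a stationary Markov process $\Psi^{\mu}$ on $\{-1,1\}^S$ with marginal distribution $\mu$. Let

$\lambda := \sup_{(s,\sigma)} C(s,\sigma).$

For any $\tau > 0$, if we set $\varepsilon := 1 - e^{-\lambda\tau}$, we have that

$\mu^{(-,\varepsilon)} \preceq \mu_{\inf,\tau}.$

Similarly, we get that

$\mu_{\sup,\tau} \preceq \mu^{(+,\varepsilon)}.$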




PROOF. We will prove the first statement; the second statement follows by symmetry. Take $\tau > 0$. For every $s \in S$, associate an independent Poisson process with parameter $\lambda$. Define $\{(X^1_t, X^2_t)\}_{t\ge 0}$ in the following way. Let $X^1_0 \equiv X^2_0 \sim \mu$, and take $t'$ to be an arrival time for the Poisson process of a site $s$. For $i \in \{1, 2\}$, let $X^i_{t'-}$ and $X^i_{t'+}$ denote the configurations before and after the arrival. We let $X^1_{t'+}(s) \ne X^1_{t'-}(s)$ with probability $C(s, X^1_{t'-})/\lambda$ and we let $X^2_{t'+}(s) = 0$ and finally, we let $X^1_{t'+}(S \setminus s) = X^1_{t'-}(S \setminus s)$, $X^2_{t'+}(S \setminus s) = X^2_{t'-}(S \setminus s)$. Do this independently for all arrival times for all Poisson processes of all sites. Observe that once $X^2_t(s)$ is 0, it remains so. Note also that $X^1_\tau \sim \mu$, $X^2_\tau \sim \mu^{(-,\varepsilon)}$. Furthermore, if $X^1_t(s) = 0$ for some $t \in [0, \tau]$, the construction guarantees that $X^2_\tau(s) = 0$ and, therefore, $X^2_\tau \le X^1_{\inf,\tau} \sim \mu_{\inf,\tau}$. □


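LEMMA 5.2. Take $S$ to be the sites of any bounded degree graph. Let $\{C(s,\sigma)\}_{s\in S,\,\sigma\in\{-1,1\}^S}$ be the flip rate intensities of a stationary Markov process $\Psi^{\mu}$ on $\{-1,1\}^S$ with marginal distribution $\mu$. Define

$\lambda_1 := \inf_{s,\sigma :\, \sigma(s)=1} C(s,\sigma).$

If $\lambda_1 > 0$, then for any $0 < \varepsilon < 1$, if we set $\tau := -\log(1-\varepsilon)/\lambda_1$, we have that

$\mu_{\inf,\tau} \preceq \mu^{(-,\varepsilon)}.$

Similarly, defining $\lambda_2 := \inf_{s,\sigma :\, \sigma(s)=0} C(s,\sigma)$, if $\lambda_2 > 0$, then for any $0 < \varepsilon < 1$, if we set $\tau := -\log(1-\varepsilon)/\lambda_2$, we have that

$\mu^{(+,\varepsilon)} \preceq \mu_{\sup,\tau}.$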

PROOF. We will prove the first statement; the second statement follows by symmetry. For every $s \in S$, associate an independent Poisson process with parameter $\lambda := \sup_{(s,\sigma)} C(s,\sigma)$. Next, let $\{U_{s,k}\}_{s\in S,\,k\ge 1}$ be independent uniform $[0,1]$ random variables also independent of the Poisson processes. If $t'$ is an arrival time for the Poisson process at site $s$, we write $U_{s,t'}$ for $U_{s,k}$, where $k$ is such that $t'$ is the time of the $k$th arrival of the Poisson process at site $s$. Define $\{(X^1_t, X^2_t)\}_{t\ge 0}$ in the following way. Let $X^1_0 \equiv X^2_0 \sim \mu$, and take $t'$ to be an arrival time for the Poisson process of a site $s$. We let $X^1_{t'+}(s) \ne X^1_{t'-}(s)$ if $U_{s,t'} \le C(s, X^1_{t'-})/\lambda$. Furthermore, we let $X^2_{t'+}(s) = 0$ if $U_{s,t'} \le \lambda_1/\lambda$ or $X^2_{t'-}(s) = 0$, and finally, we let $X^1_{t'+}(S \setminus s) = X^1_{t'-}(S \setminus s)$, $X^2_{t'+}(S \setminus s) = X^2_{t'-}(S \setminus s)$. Do this independently for all arrival times for all Poisson processes of all sites. Clearly, $X^1_\tau \sim \mu$ and $X^2_\tau \sim \mu^{(-,\varepsilon)}$. Furthermore, if $X^2_\tau(s) = 0$, then either $X^1_0(s) = X^2_0(s) = 0$ or there exists a $t \in [0, \tau]$ such that $t$ is an arrival time for the Poisson process associated to $s$ and $U_{s,t} \le \lambda_1/\lambda$. Since $\lambda_1 \le C(s, X^1_{t-})$ if $X^1_{t-}(s) = 1$, we get that either $X^1_{t-}(s)$ or $X^1_{t+}(s)$ is 0 and, therefore, $X^1_{\inf,\tau} \le X^2_\tau$. □




To illustrate why the condition $\lambda_1 > 0$ of Lemma 5.2 is needed, consider the case $\mu = \pi_p$ for some $p > 0$. With $\varepsilon > 0$, if we assume the trivial dynamics $C(s,\sigma) \equiv 0$ for all $s, \sigma$, we will of course not have that $\mu_{\inf,\tau} \preceq \mu^{(-,\varepsilon)}$ for any $\tau > 0$.
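Both lemmas of this section trade a time window for a thinning probability through the exponential waiting time of a Poisson process. Lemma 5.1 sets $\varepsilon = 1 - e^{-\lambda\tau}$, and the sketch below assumes that Lemma 5.2 chooses $\tau$ by solving $1 - e^{-\lambda_1\tau} = \varepsilon$. The function names are hypothetical and the code only illustrates this conversion.

```python
import math


def epsilon_from_tau(lam, tau):
    """Probability that a rate-lam Poisson process has an arrival in [0, tau]."""
    return 1.0 - math.exp(-lam * tau)


def tau_from_epsilon(lam1, eps):
    """Time tau such that a rate-lam1 Poisson process has an arrival in
    [0, tau] with probability exactly eps (requires lam1 > 0 and 0 < eps < 1)."""
    if lam1 <= 0.0 or not 0.0 < eps < 1.0:
        raise ValueError("need lam1 > 0 and 0 < eps < 1")
    return -math.log(1.0 - eps) / lam1
```

The requirement `lam1 > 0` mirrors the remark above: with trivial dynamics no finite time window produces any thinning.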


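6. Proof of Theorem 1.9. Take $\lambda > \lambda_p$ and let $\lambda' = (\lambda + \lambda_p)/2$. By Theorem 1.11, there exists an $\varepsilon > 0$ such that $(\mu_{\lambda'}, \mu_\lambda)$ is downward $\varepsilon$-movable. Lemma 5.1 gives us that there exists a $\tau > 0$ such that $\mu_\lambda^{(-,\varepsilon)} \preceq \mu_{\lambda,\inf,\tau}$ and, hence, that $\mu_{\lambda'} \preceq \mu_{\lambda,\inf,\tau}$. Therefore, since $\mathcal{C}^+$ is an increasing event and $\lambda' > \lambda_p$, we have that

$1 = \mu_{\lambda'}(\mathcal{C}^+) \le \mu_{\lambda,\inf,\tau}(\mathcal{C}^+)$

and so

$\Psi^{\lambda}(\mathcal{C}^+_t \ \forall t \in [0, \tau]) = 1.$

The theorem now follows from countable additivity. □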


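7. Proof of Theorem 1.1. In this section we will deal with stationary distributions for interacting particle systems which are monotone in the sense of Definition 2.2.

Let $G = (S, E)$ be a countable connected locally finite graph and let $\Lambda \subseteq S$ be connected and $|\Lambda| < \infty$. Let $\{\mu^p_\Lambda\}_{p \in I}$, where $I \subseteq \mathbb{R}$, be a family of probability measures on $\{-1,1\}^{\Lambda}$ such that

$\mu^{p_1}_\Lambda \preceq \mu^{p_2}_\Lambda \quad \forall p_1 \le p_2.$

Assume that there exist stationary Markov processes $\Psi^p_\Lambda$ governed by flip rate intensities $\{C_{p,\Lambda}(s,\sigma)\}_{s\in\Lambda,\,\sigma\in\{-1,1\}^{\Lambda}}$ and with marginal distributions $\mu^p_\Lambda$. Furthermore, assume that there exist limiting distributions $\Psi^p$ of $\Psi^p_\Lambda$ and $\mu^p$ of $\mu^p_\Lambda$ as $\Lambda \uparrow S$. Assume that $\mu^p_\Lambda$ are monotone for every $p$ and $\Lambda$. For $p_1 < p_2$, let

$A_{\Lambda,p_1,p_2} := \inf_{s \in \Lambda,\ \xi \in \{-1,1\}^{\Lambda \setminus s}} \big[\mu^{p_2}_\Lambda(\sigma(s) = 1 \mid \sigma(\Lambda \setminus s) = \xi) - \mu^{p_1}_\Lambda(\sigma(s) = 1 \mid \sigma(\Lambda \setminus s) = \xi)\big]$

and assume that, for all $p_1 < p_2$,

$\inf_{\Lambda \subseteq S} A_{\Lambda,p_1,p_2} > 0.$

For fixed $p_1 < p_2$, there exists by Proposition 3.4 an $\varepsilon > 0$ such that $(\mu^{p_1}, \mu^{p_2})$ is both upward and downward $\varepsilon$-movable. Next, by Lemma 5.1, there exists a $\tau > 0$ such that

$\mu^{p_2,(-,\varepsilon)} \preceq \mu^{p_2}_{\inf,\tau},$

and therefore,

(31) $\mu^{p_1} \preceq \mu^{p_2}_{\inf,\tau}.$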


THEOREM 7.1. Consider the setup just described. Let $A$ be an increasing event on $\{-1,1\}^S$ and let $A_t$ be the event that $A$ occurs at time $t$.
(1) Let $a \in \mathbb{R}$. If

$\mu^p(A) = 1$

for all $p \in I$ with $p > a$, then

$\Psi^p(A_t \text{ occurs for every } t) = 1$

for all $p \in I$ with $p > a$.
(2) Let $a \in \mathbb{R}$. If

$\mu^p(A) = 0$

for all $p \in I$ with $p < a$, then

$\Psi^p(A_t \text{ occurs for some } t) = 0$

for all $p \in I$ with $p < a$.

PROOF. We prove only (1), as (2) is proved in an identical way. Take $p > a$ and let $p_2 = (p + a)/2$. By the argument leading toward (31), there exists $\tau > 0$ such that

$\mu^{p_2}(A) \le \mu^{p}_{\inf,\tau}(A).$

By using $\mu^{p_2}(A) = 1$ and

$\mu^{p}_{\inf,\tau}(A) \le \Psi^p(A_t \text{ occurs for every } t \in [0, \tau]),$

we get by countable additivity that

$\Psi^p(A_t \text{ occurs for every } t) = 1.$ □


We will now be able to prove Theorem 1.1 easily. 


PROOF OF THEOREM 1.1. We prove only the very first statement; all the other statements are proved in a similar manner. We fix $\beta > 0$ and then $h$ will correspond to our parameter $p$ in the above setup. For any $\Lambda \subseteq S$, any $s \in \Lambda$ and any $\xi \in \{-1,1\}^{\Lambda \setminus s}$, we have that

(32) $\mu^{+,\beta,h}_\Lambda(\sigma(s) = 1 \mid \sigma(\Lambda \setminus s) = \xi) = \dfrac{1}{1 + e^{-2\beta(\sum_{t :\, t \sim s} \xi(t)) - 2h}},$

where we let $\xi(t) = 1$ if $t \in \Lambda^c$ in order to take the boundary condition into account. It is obvious from (32) and the definition of monotonicity that $\mu^{+,\beta,h}_\Lambda$ is monotone for any $h$ and $\Lambda$. Letting $h_1 < h_2$, it is immediate that

$A_{\Lambda,h_1,h_2} = \inf_{s \in \Lambda,\ \xi \in \{-1,1\}^{\Lambda \setminus s}} \Big[\dfrac{1}{1 + e^{-2\beta(\sum_{t :\, t \sim s} \xi(t)) - 2h_2}} - \dfrac{1}{1 + e^{-2\beta(\sum_{t :\, t \sim s} \xi(t)) - 2h_1}}\Big] > 0,$

where again $\xi(t) = 1$ for all $t \in \Lambda^c$. It is not hard to see that this strict inequality must hold uniformly in $\Lambda$, that is,

$\inf_{\Lambda \subseteq S} A_{\Lambda,h_1,h_2} > 0.$

It follows that all of the assumptions of Theorem 7.1 hold and part (1) of that result gives us what we want. □
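The uniformity in $\Lambda$ can be seen numerically: the conditional probability (32) depends on the surrounding configuration only through the sum of the neighbouring spins, which takes finitely many values on a bounded degree graph. The sketch below evaluates (32) and the resulting gap for a site of a given degree; the function names are hypothetical, and restricting to a single degree is a simplification made for illustration.

```python
import math


def plus_probability(beta, h, neighbour_sum):
    """Conditional probability (32) of spin +1 at a site, given that the
    spins of its neighbours add up to neighbour_sum."""
    return 1.0 / (1.0 + math.exp(-2.0 * beta * neighbour_sum - 2.0 * h))


def field_gap(beta, h1, h2, degree):
    """Minimum over all neighbour sums of a degree-`degree` site of the
    difference between the conditional probabilities at fields h2 and h1."""
    sums = range(-degree, degree + 1, 2)  # possible sums of `degree` spins +-1
    return min(
        plus_probability(beta, h2, m) - plus_probability(beta, h1, m)
        for m in sums
    )
```

Since the minimum is over a finite set that does not depend on $\Lambda$, it is strictly positive whenever $h_1 < h_2$.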


PROOF OF LEMMA 1.2. Fix $\beta > 0$. Given any $p \in (0, 1)$, it is easy to see that there exists a real number $h_2$ such that, for all $h \ge h_2$, for $s \in S$ and for all $\xi \in \{-1,1\}^{S \setminus s}$,

$\mu^{+,\beta,h}(\sigma(s) = 1 \mid \sigma(S \setminus s) = \xi) \ge p$

and, hence, $\pi_p \preceq \mu^{+,\beta,h}$. It is also easy to see that there exists a real number $h_1$ such that, for all $h \le h_1$, for $s \in S$ and for all $\xi \in \{-1,1\}^{S \setminus s}$,

$\mu^{+,\beta,h}(\sigma(s) = 1 \mid \sigma(S \setminus s) = \xi) \le p$

and, hence, $\mu^{+,\beta,h} \preceq \pi_p$. The statements of the lemma easily follow from these facts. □


8. Proof of Theorem 1.3. In this section we will use a variant of the so-called Peierls argument to prove Theorem 1.3. We prove this only for $\mathbb{Z}^2$; the proof (with more complicated topological details) can be carried out for $\mathbb{Z}^d$ with $d \ge 3$.

We will write $0 \overset{-,t}{\longleftrightarrow} \partial\Lambda_L$ for the event that there exists a path of sites in state $-1$ connecting the origin to $\partial\Lambda_L := \Lambda_{L+1} \setminus \Lambda_L$ at time $t$ and we will write $0 \overset{-,t}{\longleftrightarrow} \infty$ for the event that there exists an infinite path of sites in state $-1$ containing the origin at time $t$. We will also write $0 \overset{+,t}{\longleftrightarrow} \partial\Lambda_L$ and $0 \overset{+,t}{\longleftrightarrow} \infty$ for the obvious analogous events. We will first need Lemma 8.1 and the concept of a dual graph. The dual graph $G_n^{\mathrm{dual}} = (S_n^{\mathrm{dual}}, E_n^{\mathrm{dual}})$ of $G_n = (S_n, E_n)$ consists of the set of sites $S_n^{\mathrm{dual}} := \{-n - \tfrac12, \dots, n + \tfrac12\}^2$ and $E_n^{\mathrm{dual}}$, which is the set of nearest neighbor pairs of $S_n^{\mathrm{dual}}$. In this paper we will only work with the edges of the dual graph. An edge $e \in E_n^{\mathrm{dual}}$ crosses one (and only one) edge $f \in E_n$, and the end sites of this edge $f$ will be called the sites (of $G_n$) associated to $e$. For a random spin configuration $X$ on $\{-1,1\}^{S_n}$, define a random edge configuration $Y$ on $\{0,1\}^{E_n^{\mathrm{dual}}}$ in the following way:

$Y(e) = \begin{cases} 0, & \text{if } X(t) = X(s), \\ 1, & \text{if } X(t) \ne X(s), \end{cases}$

where $s, t$ are the sites associated to edge $e \in E_n^{\mathrm{dual}}$. In Figure 1 we have drawn a configuration $\sigma \in \{-1,1\}^{S_1}$ and the induced edge configuration on $\{0,1\}^{E_1^{\mathrm{dual}}}$. Assume that the sites evolve according to the flip rate intensities $\{C_n(s,\sigma)\}_{s\in S_n,\,\sigma\in\{-1,1\}^{S_n}}$. Consider $\gamma$, a (finite) path of edges in the dual graph.






FIG. 1. $S_1$ and the edges of its dual graph. A solid circle marks a site with spin 1, while an empty
circle has spin —1. A solid line is a present edge of the dual graph, and a dashed line is an absent 
edge of the dual graph. 


Take $\gamma'$ to be a subset of $\gamma$. Assume that all edges of $\gamma'$ are absent and all edges of $\gamma \setminus \gamma'$ are present at $t = 0$. We want to estimate the probability of the event that all edges of $\gamma'$ are present at some point (not necessarily all at the same time) during some time interval $[0, \tau]$. In other words, we want to estimate

$P(Y_{\sup,\tau}(\gamma') \equiv 1 \mid Y_0(\gamma') \equiv 0,\ Y_0(\gamma \setminus \gamma') \equiv 1).$


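LEMMA 8.1. Let $\{C_n(s,\sigma)\}_{s\in S_n,\,\sigma\in\{-1,1\}^{S_n}}$ be the flip rate intensities for a stationary Markov process on $\{-1,1\}^{S_n}$ and let $Y_t$ be defined as above. Let

$\lambda := \sup_{(s,\sigma)} C_n(s,\sigma) \quad (< \infty).$

For any $\tau > 0$ and any $\gamma' \subseteq E_n^{\mathrm{dual}}$,

$P(Y_{\sup,\tau}(\gamma') \equiv 1 \mid Y_0(\gamma') \equiv 0,\ Y_0(E_n^{\mathrm{dual}} \setminus \gamma') \equiv 1) \le \big(4(1 - e^{-\lambda\tau})^{1/4}\big)^{|\gamma'|}.$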


PROOF. Take $\tau > 0$. For every $s \in S_n$, associate an independent Poisson process with parameter $\lambda$. Define $\{X_t\}_{t\ge 0}$ in the following way. Let $X_0 \sim \mu$ and take $t'$ to be an arrival time for the Poisson process of a site $s$. We let $X_{t'+}(s) \ne X_{t'-}(s)$ with probability $C_n(s, X_{t'-})/\lambda$. Do this independently for all arrival times for all Poisson processes associated to the different sites. It is immediate that $X_t \sim \mu$. Let $s_i$, $i \in \{1, \dots, l\}$, be distinct sites of $S_n$. The event $\{X_{\inf,\tau}(s_i) \ne X_{\sup,\tau}(s_i)\ \forall i \in \{1, \dots, l\}\}$ is contained in the event that every Poisson process associated to the sites $s_i$, $i \in \{1, \dots, l\}$, has had at least one arrival by time $\tau$. The probability that a particular site has had an arrival by time $\tau$ is $1 - e^{-\lambda\tau}$. Furthermore, this event is independent of the Poisson processes for all other sites. Therefore,

(34) $P(X_{\inf,\tau}(s_i) \ne X_{\sup,\tau}(s_i)\ \forall i \in \{1, \dots, l\}) \le (1 - e^{-\lambda\tau})^l.$

Given $\gamma'$, consider the set of all sites associated to some edge of $\gamma'$ and let $n_{\gamma'}$ be the cardinality of that set. Observe that $n_{\gamma'} \le 2|\gamma'|$ and that in order for the event $\{Y_{\sup,\tau}(\gamma') \equiv 1 \mid Y_0(\gamma') \equiv 0,\ Y_0(E_n^{\mathrm{dual}} \setminus \gamma') \equiv 1\}$ to occur, at least $|\gamma'|/4$ of the sites associated to $\gamma'$ must flip during $[0, \tau]$. This is because one site is associated to at most 4 edges. Denote the event that at least $|\gamma'|/4$ of the sites associated to $\gamma'$ flip during $[0, \tau]$ by $A_{\tau,\gamma'}$. Take $S$ to be a subset of the sites associated to $\gamma'$ such that $|S| \ge |\gamma'|/4$. By (34), the probability that all of these sites flip during $[0, \tau]$ is less than $(1 - e^{-\lambda\tau})^{|S|} \le (1 - e^{-\lambda\tau})^{|\gamma'|/4}$. To conclude, observe that the number of subsets of the sites associated to $\gamma'$ is bounded by $2^{2|\gamma'|}$. Hence, the probability of the event $A_{\tau,\gamma'}$ must be less than $(1 - e^{-\lambda\tau})^{|\gamma'|/4}\, 2^{2|\gamma'|}$, and so

$P(Y_{\sup,\tau}(\gamma') \equiv 1 \mid Y_0(\gamma') \equiv 0,\ Y_0(E_n^{\mathrm{dual}} \setminus \gamma') \equiv 1) \le P(A_{\tau,\gamma'}) \le \big(4(1 - e^{-\lambda\tau})^{1/4}\big)^{|\gamma'|}.$ □
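The bound of Lemma 8.1 is the product of the two counting estimates in the proof: $2^{2|\gamma'|}$ subsets of sites, each flipping entirely with probability at most $(1 - e^{-\lambda\tau})^{|\gamma'|/4}$. The sketch below evaluates the bound and inverts it, which is how a time window $\tau$ can be matched to a prescribed per-edge factor $\varepsilon = 4(1 - e^{-\lambda\tau})^{1/4}$. The function names are hypothetical.

```python
import math


def edge_flip_bound(lam, tau, k):
    """Upper bound (4 (1 - exp(-lam*tau))^(1/4))^k on the probability that
    k initially absent dual edges all become present during [0, tau]."""
    return (4.0 * (1.0 - math.exp(-lam * tau)) ** 0.25) ** k


def tau_for_epsilon(lam, eps):
    """Solve eps = 4 (1 - exp(-lam*tau))^(1/4) for tau (needs lam > 0, 0 < eps < 4)."""
    if lam <= 0.0 or not 0.0 < eps < 4.0:
        raise ValueError("need lam > 0 and 0 < eps < 4")
    return -math.log(1.0 - (eps / 4.0) ** 4) / lam
```

The bound is only informative when the per-edge factor is below 1, that is, for short time windows.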


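PROOF OF THEOREM 1.3. We will prove the theorem for $d = 2$. For $\beta > \beta_p$, choose $\delta_1 > 0$ so that $\beta' := \beta(2 - \delta_1)/2 > \beta_p$ and, hence,

$\sum_{l=1}^{\infty} l\, 3^{l} e^{-2\beta' l} < \infty.$

Next, choose $N$ and $\varepsilon < 1/2$ such that $4/N < \delta_1$ and $\varepsilon \le e^{-\beta(2-\delta_1)N}$, and let $\tau$ be such that $\varepsilon = 4(1 - e^{-\lambda\tau})^{1/4}$. Let $\delta > 0$ be arbitrary and choose $L$ so that

$\sum_{l=L}^{\infty} l\, 3^{l} e^{-2\beta' l} < \delta.$

Let $\mathcal{E}_{L,\tau}$ be the event that $0 \overset{-,t}{\longleftrightarrow} \partial\Lambda_L$ for some $t \in [0, \tau]$. Let $\Psi_n^{+,\beta}$ be defined as in Section 2.3. We will show that

$\Psi_n^{+,\beta}(\mathcal{E}_{L,\tau}) < \delta \quad \forall n > L.$

Since $\Psi_n^{+,\beta}(\mathcal{E}_{L,\tau}) \to \Psi^{+,\beta}(\mathcal{E}_{L,\tau})$ (see Section 2.3) we get that $\Psi^{+,\beta}(\mathcal{E}_{L,\tau}) \le \delta$. Letting $L \to \infty$ and $\delta \to 0$, we get that

$\Psi^{+,\beta}(\exists t \in [0, \tau] : 0 \overset{-,t}{\longleftrightarrow} \infty) = 0,$

and then by countable additivity,

$\Psi^{+,\beta}(\exists t \ge 0 : 0 \overset{-,t}{\longleftrightarrow} \infty) = 0.$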



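It is well known (see [8]) that if all sites in $\Lambda_{n+1} \setminus \Lambda_n$ take the value $+1$,

$\mathcal{E}_{L,\tau} \subseteq \{\exists \gamma \subseteq E_n^{\mathrm{dual}},\ \exists t \in [0, \tau] : |\gamma| \ge L,\ \gamma \text{ surrounds the origin},\ Y_t(\gamma) \equiv 1\}$
(35) $\quad \subseteq \{\exists \gamma \subseteq E_n^{\mathrm{dual}} : |\gamma| \ge L,\ \gamma \text{ surrounds the origin},\ Y_{\sup,\tau}(\gamma) \equiv 1\}.$

To prove $\Psi_n^{+,\beta}(\mathcal{E}_{L,\tau}) < \delta$, consider $\gamma$ with $|\gamma| = l$ a contour in $E_n^{\mathrm{dual}}$ surrounding the origin. By Lemma 8.1, $P(Y_{\sup,\tau}(\gamma') \equiv 1 \mid Y_0(\gamma') \equiv 0,\ Y_0(\gamma \setminus \gamma') \equiv 1) \le \varepsilon^{|\gamma'|}$ whenever $\gamma' \subseteq \gamma$. We get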


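$P(Y_{\sup,\tau}(\gamma) \equiv 1)$
$\quad = \sum_{k=0}^{l} \sum_{\gamma' \subseteq \gamma,\ |\gamma'| = k} P(Y_0(\gamma') \equiv 0,\ Y_0(\gamma \setminus \gamma') \equiv 1)\, P(Y_{\sup,\tau}(\gamma') \equiv 1 \mid Y_0(\gamma') \equiv 0,\ Y_0(\gamma \setminus \gamma') \equiv 1)$
(36) $\quad \le \sum_{k=0}^{l} \sum_{\gamma' \subseteq \gamma,\ |\gamma'| = k} P(Y_0(\gamma') \equiv 0,\ Y_0(\gamma \setminus \gamma') \equiv 1)\, \varepsilon^{k}$
$\quad = \sum_{k=0}^{l} P(\{\text{all edges except } k \text{ of } \gamma \text{ are present at } t = 0\})\, \varepsilon^{k}$
$\quad = \sum_{k=0}^{l/N} P(\{\text{all edges except } k \text{ of } \gamma \text{ are present at } t = 0\})\, \varepsilon^{k} + \sum_{k=l/N+1}^{l} P(\{\text{all edges except } k \text{ of } \gamma \text{ are present at } t = 0\})\, \varepsilon^{k}.$

Obviously, $l/N$ need not be an integer, but correcting for this is trivial and is left for the reader.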
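We need to estimate $P(\{\text{all edges except } k \text{ of } \gamma \text{ are present at } t = 0\})$. For this purpose, define $T : \{-1,1\}^{S_n} \to \{-1,1\}^{S_n}$ by

$(T\sigma)(s) = \begin{cases} \sigma(s), & \text{if } s \text{ is not in the domain bounded by } \gamma, \\ -\sigma(s), & \text{if } s \text{ is in the domain bounded by } \gamma, \end{cases}$

for all $\sigma \in \{-1,1\}^{S_n}$. Let $E_k = \{\sigma : \text{all edges except } k \text{ of } \gamma \text{ are present}\}$. Since $H_n^{+,\beta}$ of (6) gives a contribution of $-\beta$ for adjacent pairs of equal spin and $+\beta$ for adjacent pairs of unequal spin, we have that, for $\sigma \in E_k$,

$H_n^{+,\beta}(T\sigma) = H_n^{+,\beta}(\sigma) - 2\beta(|\gamma| - k) + 2\beta k = H_n^{+,\beta}(\sigma) - 2\beta|\gamma| + 4\beta k.$

Hence, for $\sigma \in E_k$,

$e^{-H_n^{+,\beta}(\sigma)} = e^{-H_n^{+,\beta}(T\sigma) - 2\beta|\gamma| + 4\beta k},$

and so, with $Z$ denoting the normalizing constant,

$\mu_n^{+,\beta}(E_k) = \sum_{\sigma \in E_k} \mu_n^{+,\beta}(\sigma) = e^{-2\beta l + 4\beta k} \sum_{\sigma \in E_k} \frac{e^{-H_n^{+,\beta}(T\sigma)}}{Z} \le e^{-2\beta l + 4\beta k} \sum_{\sigma \in \{-1,1\}^{S_n}} \frac{e^{-H_n^{+,\beta}(T\sigma)}}{Z} = e^{-2\beta l + 4\beta k},$

where the last equality follows from $T$ being bijective. We then get that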


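$\sum_{k=0}^{l/N} P(\{\text{all edges except } k \text{ of } \gamma \text{ are present at } t = 0\})\, \varepsilon^{k}$
(37) $\quad \le \sum_{k=0}^{l/N} e^{-2\beta l + 4\beta k} \varepsilon^{k} \le e^{-2\beta l + 4\beta l/N} \sum_{k=0}^{l/N} \varepsilon^{k} \le 2e^{-2\beta l + 4\beta l/N} \le 2e^{-\beta(2 - \delta_1) l} = 2e^{-2\beta' l}.$

Furthermore,

$\sum_{k=l/N+1}^{l} P(\{\text{all edges except } k \text{ of } \gamma \text{ are present at } t = 0\})\, \varepsilon^{k}$
(38) $\quad \le \varepsilon^{l/N} \sum_{k=l/N+1}^{l} P(\{\text{all edges except } k \text{ of } \gamma \text{ are present at } t = 0\}) \le \varepsilon^{l/N} \le e^{-\beta(2 - \delta_1) l} = e^{-2\beta' l},$

where we use that $\{\text{all edges except } k \text{ of } \gamma \text{ are present at } t = 0\}$ are disjoint events for different $k$. Hence, (36), (37) and (38) combined give us

$P(Y_{\sup,\tau}(\gamma) \equiv 1) \le 3e^{-2\beta' l}$

and so by (35), for all $n > L$,

$\Psi_n^{+,\beta}(\mathcal{E}_{L,\tau}) \le \Psi_n^{+,\beta}(\exists \gamma \subseteq E_n^{\mathrm{dual}} : |\gamma| \ge L,\ \gamma \text{ surrounds the origin},\ Y_{\sup,\tau}(\gamma) \equiv 1) \le \sum_{l=L}^{\infty} l\, 3^{l-1}\, 3e^{-2\beta' l} < \delta,$

where the second to last inequality follows from the fact that the number of contours around the origin of length $l$ is at most $l\, 3^{l-1}$ (see [8]). □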


REMARK. For $\mathbb{Z}^d$, the proof is generalized by noting that the number of connected surfaces of size $l$ surrounding the origin is at most $C(d)^l$, for some constant $C(d)$. The arguments are the same but the "topological details" are messier.




9. Proof of Theorem 1.5. We will start this subsection by presenting a theo- 
rem by Liggett, Schonmann and Stacey [21]. 


THEOREM 9.1. Let $G = (S, E)$ be a graph with a countable set of sites in which every site has degree at most $\Delta \ge 1$, and in which every finite connected component of $G$ contains a site of degree strictly less than $\Delta$. Let $p, \alpha, r \in [0, 1]$, $q = 1 - p$, and suppose that

$(1 - \alpha)(1 - r)^{\Delta - 1} \ge q,$
$(1 - \alpha)\alpha^{\Delta - 1} \ge q.$

If $\mu \in G(p)$, then $\pi_{\alpha r} \preceq \mu$. In particular, if $q \le (\Delta - 1)^{\Delta - 1}/\Delta^{\Delta}$, then $\pi_\rho \preceq \mu$, where

$\rho = \Big(1 - \dfrac{q^{1/\Delta}}{(\Delta - 1)^{(\Delta - 1)/\Delta}}\Big)\big(1 - (q(\Delta - 1))^{1/\Delta}\big).$

Here $G(p)$ denotes the set of probability measures on $\{-1,1\}^S$ such that if $\mu \in G(p)$, $X \sim \mu$, then for any site $s \in S$,

$P[X(s) = 1 \mid \sigma(\{X(t) : \{s, t\} \notin E\})] \ge p \quad \text{a.s.}$

Observe that when $p \to 1$, $q \to 0$ and so $\rho \to 1$. The above theorem is stated as the original in [21]. However, by considering the line-graph of $G = (S, E)$, it can be restated in the following way.


COROLLARY 9.2. Let $G = (S, E)$ be any countable graph of degree at most $\Delta$. For each $0 < \rho < 1$, there exists a $0 < p < 1$, where $p = p(\Delta, \rho)$, such that if $Y \sim \nu$, where $\nu$ is a probability measure on the edges of $G$ such that for every edge $e \in E$,

$P[Y(e) = 1 \mid \sigma(\{Y(f) : e \nsim f\})] \ge p \quad \text{a.s.},$

we have that $\pi^E_\rho \preceq \nu$.


By $e \nsim f$, we of course mean that the edges $e$ and $f$ do not have any endpoints in common. Here, $\pi^E_\rho$ is the product measure with density $\rho$ on the edges of $G$.

Consider a graph $G = (S, E)$ and a subgraph $G' = (S', E')$, where $S' = S$ and $E' \subseteq E$. Let $X \sim \pi_p$ on $S$. We declare an edge $e \in E'$ to be closed if any of the endpoints takes the value 0 under $X$. Corollary 9.2 gives us that, for any $\rho < 1$, there is a $p < 1$ such that this method of closing edges dominates independent bond percolation with density $\rho$ on $E'$. Observe that we can choose $p$ independent of $E'$ since the maximal degree of $E'$ is bounded above by the maximal degree of $E$.
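The explicit density in Theorem 9.1 corresponds to a particular choice of $\alpha$ and $r$, with $\rho = \alpha r$, for which the first constraint holds with equality. The sketch below computes these quantities; it assumes $\Delta \ge 2$ so that the expression is well defined, and the function names are hypothetical.

```python
def lss_alpha_r(q, delta):
    """The pair (alpha, r) behind the explicit density in Theorem 9.1.

    Requires an integer delta >= 2 and 0 <= q <= (delta-1)^(delta-1) / delta^delta.
    """
    if delta < 2:
        raise ValueError("this sketch assumes maximal degree delta >= 2")
    if not 0.0 <= q <= (delta - 1) ** (delta - 1) / delta ** delta:
        raise ValueError("q is outside the range covered by the theorem")
    alpha = 1.0 - q ** (1.0 / delta) / (delta - 1) ** ((delta - 1.0) / delta)
    r = 1.0 - (q * (delta - 1)) ** (1.0 / delta)
    return alpha, r


def lss_density(q, delta):
    """Density rho = alpha * r of the dominated product measure."""
    alpha, r = lss_alpha_r(q, delta)
    return alpha * r
```

As $q \to 0$ the density tends to 1, which is the observation used to pass from the site statement to Corollary 9.2.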




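Let $(X, Y) \sim \mathbf{P}^p_n$, defined in Section 2.5. Close every $e \in E_n$ such that $Y(e) = 1$ independently with probability $\varepsilon$, thus creating $(X, Y^{(-,\varepsilon)})$. Compare this to closing every site in $S_n$ independently with parameter $\varepsilon'$ [creating $X^{(-,\varepsilon')}$] and defining

$Y^{\varepsilon'}(e) = \begin{cases} 1, & \text{if } Y(e) = 1 \text{ and neither one of the endpoints of } e \text{ flips}, \\ 0, & \text{otherwise}. \end{cases}$

By the arguments of the last paragraph, we see that, for a fixed $\varepsilon$, there exists an $\varepsilon'$ [that we can choose independent of $(X, Y)$ and $n$] such that the first way (i.e., independent bond percolation) of removing edges is stochastically dominated by the latter. Hence,

$\mathbf{P}^p_n\big((X, Y^{(-,\varepsilon)}) \in (\{-1,1\}^{S_n}, \cdot) \mid (X, Y)\big) \preceq \mathbf{P}^p_n\big((X^{(-,\varepsilon')}, Y^{\varepsilon'}) \in (\{-1,1\}^{S_n}, \cdot) \mid (X, Y)\big).$

By averaging over all possible $(X, Y)$, the next lemma follows.


LEMMA 9.3. With notation as above, for any $\varepsilon > 0$, there exists $\varepsilon' > 0$ independent of $n$ such that

$\mathbf{P}^p_n\big((X, Y^{(-,\varepsilon)}) \in (\{-1,1\}^{S_n}, \cdot)\big) \preceq \mathbf{P}^p_n\big((X^{(-,\varepsilon')}, Y^{\varepsilon'}) \in (\{-1,1\}^{S_n}, \cdot)\big).$


Observe that

(39) $\mathbf{P}^p_n\big((X, Y^{(-,\varepsilon)}) \in (\{-1,1\}^{S_n}, \cdot)\big) = \nu_n^{p,(-,\varepsilon)}(\cdot)$

and that

(40) $\mathbf{P}^p_n\big((X^{(-,\varepsilon')}, Y^{\varepsilon'}) \in (\cdot, \{0,1\}^{E_n})\big) = \mu_n^{+,\beta,(-,\varepsilon')}(\cdot).$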

We are now ready to prove Theorem 1.5. 
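PROOF OF THEOREM 1.5. For any choice of $\beta > \beta_c$, take $p = 1 - e^{-2\beta}$ and let $\delta \in (0, p - p_c)$. Now, (14) and Holley's inequality imply that

$\nu_n^{p-\delta} \preceq \nu_n^{p} \quad \forall n \in \mathbb{N}^+.$

Since, by (14), both $\nu_n^{p-\delta}$ and $\nu_n^{p}$ are monotone, there exists by Lemma 3.3 (it is easy to check that all other conditions of that lemma are satisfied) an $\varepsilon > 0$ such that

(41) $\nu_n^{p-\delta} \preceq \nu_n^{p,(-,\varepsilon)} \quad \forall n \in \mathbb{N}^+.$

In [13] they show that the limit $\lim_{n\to\infty} \nu_n^{p-\delta}(0 \leftrightarrow \partial\Lambda_n)$ exists and that

(42) $\lim_{n\to\infty} \nu_n^{p-\delta}(0 \leftrightarrow \partial\Lambda_n) > 0.$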




Here (0 <— 9 A4] denotes the event that there exists a path of present edges 
connecting the origin to 0A, :— An+1 V An. Since (0 «—» JAn} is an increasing 
event on the edges, Lemma 9.3 guarantees the existence of an &' > 0 such that 


pP C9 (0 <> 8 A4) 
= PP((x, YC€^9) e ((—1,1)9,0 <= 8A,)) 
«PP((xC9.,y*)e(—1,19,0— 39A,) — VneN*. 


If there exists a path of present edges connecting the origin to the boundary $\partial\Lambda_n$
under $Y$, all the sites of this path must have the value 1 under $X$. Similarly for
$(X^{(-,\varepsilon')}, Y^{*})$, if there exists a path of present edges connecting the origin to the
boundary $\partial\Lambda_n$ under $Y^{*}$, all the sites of this path must have the value 1 under
$X^{(-,\varepsilon')}$. Hence,


P? ((X C9), Y*) e ((—1, 1)?,0 <> 3A4)) 
= PP ((XC^55, y*) e (0 <> 8A4,0 <> 8A,4)) 
« P? (x9), y*) e (0 <=> 8A,, (0, 1)72)) 
= ubPC (9 9A,). 
Of course, 
u PCs (0 PRU An) < ut C6) (o es dAz) VL<n. 
: Therefore, for any L, we have that 


0 < lim EP’ (0 <> 3An) 


< limp PO (9 3, A1) = iO + aan), 
and so 


0 < lim urbe c aAL) = y^P C909 00), 


The limit in L exists since (0 <> 3A7,) C (0 <-> 9Az,} for Lı < L2. Since 
ut is ergodic (see [19], pages 143 and 195), it follows that y ^9 C^*? must also 
be ergodic. This is because L^ C^*? can be expressed as a function of two inde- 
pendent processes, one being z+’? and the other a product measure. We conclude 
that 


(43) pwr Be) (e+) 1. 
By Lemma 5.1, there exists a t > O such that 


uoo) ~< TES 


inf,t 




and therefore, 
bint (C*) = 1. 
Therefore, 
wtb (Cr occurs for every t € [0, r] = 1. 
Finally, using countable additivity, 
wt8(@* occurs for every t) = 1. 0 


10. Proof of Theorem 1.4. The aim of this section is to prove Theorem 1.4. 
For that we will use Theorem 1.5 and Lemma 10.1. We will not prove Lemma 10.1 
since it follows immediately from the proof of Lemma 11.12 in [10] due to 
Y. Zhang. 

A probability measure $\mu$ on $\{-1, 1\}^{\mathbb{Z}^2}$ is said to have the finite energy property
if all conditional probabilities on finite sets are strictly positive.


LEMMA 10.1. Take $\mu$ to be any probability measure on $\{-1, 1\}^{\mathbb{Z}^2}$ which
has positive correlations and the finite energy property. Assume further that $\mu$ is
invariant under translations, rotations and reflections in the coordinate axes. If
$\mu(C^{+}) = 1$, then $\mu(C^{-}) = 0$.


PROOF OF THEOREM 1.4. Fix f > ße. By (43), there exists € > O such that 
p P Cet) = 


Since z+’? and x1—s both have positive correlations, it follows that j, 5 79 has 
positive correlations. This is because (see [19], page 78) the product of two prob- 
ability measures which have positive correlations also has positive correlations. 
Furthermore, a collection of increasing functions of random variables which have 
positive correlations also has positive correlations. In addition, the finite energy 
property is easily seen to hold for po? C^ 9. Using this, we can by Lemma 10.1 
conclude that 


uo P C97) — 0. 
By Lemma 5.1, there exists a t > 0 such that jj ^^ C9) < iy and hence, 
Hinge (C7) = 0. 
It follows that 
y ^s (dt € [0, tT]: C; occurs) = 0, 
and by countable additivity, we conclude 
wt? (dt 0:G;7 occurs) — 0. o 




Acknowledgment. We thank the referee for a very careful reading and for 
providing a number of suggestions. 


REFERENCES 


[1] AIZENMAN, M., BRICMONT, J. and LEBOWITZ, J. L. (1987). Percolation of the minority
spins in high-dimensional Ising models. J. Statist. Phys. 49 859.
[2] VAN DEN BERG, J., MEESTER, R. and WHITE, D. G. (1997). Dynamic Boolean models.
Stochastic Process. Appl. 69 247-257. MR1472953
[3] BRICMONT, J., LEBOWITZ, J. L. and MAES, C. (1987). Percolation in strongly correlated
systems: The massless Gaussian field. J. Statist. Phys. 48 1249-1268. MR0914444
[4] BROMAN, E. I., HÄGGSTRÖM, O. and STEIF, J. E. (2006). Refinements of stochastic
domination. Probab. Theory Related Fields. To appear.
[5] CONIGLIO, A., NAPPI, C. R., PERUGGI, F. and RUSSO, L. (1976). Percolation and phase
transitions in the Ising model. Comm. Math. Phys. 51 315-323. MR0426745
[6] DURRETT, R. (1988). Lecture Notes on Particle Systems and Percolation. Wadsworth and
Brooks/Cole Advanced Books and Software, Pacific Grove, CA. MR0940469
[7] FORTUIN, C. M., KASTELEYN, P. W. and GINIBRE, J. (1971). Correlation inequalities on
some partially ordered sets. Comm. Math. Phys. 22 89-103. MR0309498
[8] GEORGII, H.-O. (1988). Gibbs Measures and Phase Transitions. de Gruyter, Berlin.
MR0956646
[9] GEORGII, H.-O., HÄGGSTRÖM, O. and MAES, C. (2001). The random geometry of
equilibrium phases. In Phase Transitions and Critical Phenomena 18 (C. Domb and
J. L. Lebowitz, eds.) 1-142. Academic Press, San Diego, CA. MR2014387
[10] GRIMMETT, G. (1999). Percolation, 2nd ed. Springer, Berlin. MR1707339
[11] HÄGGSTRÖM, O. (1996). The random-cluster model on a homogeneous tree. Probab. Theory
Related Fields 104 231-253. MR1373377
[12] HÄGGSTRÖM, O. (1997). Infinite clusters in dependent automorphism invariant percolation on
trees. Ann. Probab. 25 1423-1436. MR1457624
[13] HÄGGSTRÖM, O. (1998). Random-cluster representations in the study of phase transitions.
Markov Process. Related Fields 4 275-321. MR1670023
[14] HÄGGSTRÖM, O. (2000). Markov random fields and percolation on general graphs. Adv. in
Appl. Probab. 32 39-66. MR1765172
[15] HÄGGSTRÖM, O., PERES, Y. and STEIF, J. E. (1997). Dynamical percolation. Ann. Inst.
H. Poincaré Probab. Statist. 33 497-528. MR1465800
[16] HIGUCHI, Y. (1993). Coexistence of infinite (*)-clusters. II. Ising percolation in two
dimensions. Probab. Theory Related Fields 97 1-33. MR1240714
[17] HIGUCHI, Y. (1993). A sharp transition for the two-dimensional Ising percolation. Probab.
Theory Related Fields 97 489-514. MR1246977
[18] HOLLEY, R. (1974). Remarks on the FKG inequalities. Comm. Math. Phys. 36 227-231.
MR0341552
[19] LIGGETT, T. M. (1985). Interacting Particle Systems. Springer, New York. MR0776231
[20] LIGGETT, T. M. (1994). Survival and coexistence in interacting particle systems. In Probability
and Phase Transition (G. Grimmett, ed.) 209-226. Kluwer, Dordrecht. MR1283183
[21] LIGGETT, T. M., SCHONMANN, R. H. and STACEY, A. M. (1997). Domination by product
measures. Ann. Probab. 25 71-95. MR1428500
[22] LIGGETT, T. M. and STEIF, J. E. (2006). Stochastic domination: The contact process, Ising
models and FKG measures. Ann. Inst. H. Poincaré Probab. Statist. To appear.
[23] SCHRAMM, O. and STEIF, J. E. (2005). Quantitative noise sensitivity and exceptional times
for percolation. Preprint.


[24] SHIELDS, P. C. (1996). The Ergodic Theory of Discrete Sample Paths. Amer. Math. Soc.,
Providence, RI. MR1400225


DEPARTMENT OF MATHEMATICS 
CHALMERS UNIVERSITY OF TECHNOLOGY 
412 96 GOTHENBURG 
SWEDEN 
E-MAIL: broman@math.chalmers.se
steif@math.chalmers.se

URL: www.math.chalmers.se/~broman 

www.math.chalmers.se/~steif 


The Annals of Probability 

2006, Vol. 34, No. 2, 577-592 

DOI: 10.1214/009117905000000819 

© Institute of Mathematical Statistics, 2006 


SINGULARITY POINTS FOR FIRST PASSAGE PERCOLATION 


By J. E. YUKICH¹ AND YU ZHANG²
Lehigh University and University of Colorado 


Let $0 < a < b < \infty$ be fixed scalars. Assign independently to each edge
in the lattice $\mathbb{Z}^2$ the value $a$ with probability $p$ or the value $b$ with probability
$1 - p$. For all $u, v \in \mathbb{Z}^2$, let $T(u, v)$ denote the first passage time between
$u$ and $v$. We show that there are points $x \in \mathbb{R}^2$ such that the "time constant"
in the direction of $x$, namely, $\lim_{n\to\infty} n^{-1} E_p[T(0, nx)]$, is not a three times
differentiable function of $p$.


1. Introduction, main results. Consider the following simple model of first
passage percolation. $E := E(\mathbb{Z}^2)$ denotes the edges in the integer lattice $\mathbb{Z}^2$,
$0 < a < b < \infty$ are fixed scalars, and $\Omega := \{a, b\}^{E}$. For all $e \in E$ and $\omega \in \Omega$,
$P[\omega_e = a] = p$ and $P[\omega_e = b] = 1 - p$, where $0 < p < 1$. In other words, we assign either
$a$ or $b$ to each edge with probability $p$ or $1 - p$ independently from the other edges.
Denote the product measure on $\Omega$ by $P_p$ and the expectation with respect to $P_p$
by $E_p$.

For all $u, v \in \mathbb{Z}^2$, let $T(u, v)$ denote the first passage time between $u$ and $v$.
Formally, $T(u, v)$ is the infimum of $\sum_{e \in \gamma} \omega_e$, where $\gamma$ ranges over all finite paths
in $\mathbb{Z}^2$ from $u$ to $v$. If $x$ and $y$ are in $\mathbb{R}^2$, we define $T(x, y) = T(x', y')$, where $x'$
(resp. $y'$) is the point in $\mathbb{Z}^2$ closest to $x$ (resp. $y$). Any possible ambiguity can be
avoided by ordering $\mathbb{Z}^2$ and taking the point in $\mathbb{Z}^2$ smallest for this order.
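
The definition of $T(u, v)$ can be made concrete with a small numerical sketch. The Python below is an illustration only, not part of the paper: the function names `sample_weights` and `passage_time`, and the restriction to a finite box $\{-\text{size}, \ldots, \text{size}\}^2$, are assumptions introduced here. Restricting paths to a box can only increase the infimum, so the returned value is in general an upper bound for $T(u, v)$; it agrees with $T(u, v)$ once the box is large enough to contain an optimal path.

```python
import heapq
import random


def sample_weights(size, a, b, p, rng):
    """Assign weight a (probability p) or b (probability 1 - p), independently,
    to every nearest-neighbour edge of the box {-size, ..., size}^2.
    Edges are keyed as (lower endpoint, upper endpoint)."""
    weights = {}
    for x in range(-size, size + 1):
        for y in range(-size, size + 1):
            for dx, dy in ((1, 0), (0, 1)):
                nx, ny = x + dx, y + dy
                if nx <= size and ny <= size:
                    weights[((x, y), (nx, ny))] = a if rng.random() < p else b
    return weights


def passage_time(weights, size, u, v):
    """Dijkstra's algorithm over paths that stay inside the box.
    Returns the minimal total edge weight of such a path from u to v."""

    def edge_weight(s, t):
        # Each undirected edge is stored once; look it up in either order.
        return weights[(s, t)] if (s, t) in weights else weights[(t, s)]

    dist = {u: 0.0}
    heap = [(0.0, u)]
    while heap:
        d, s = heapq.heappop(heap)
        if s == v:
            return d
        if d > dist.get(s, float("inf")):
            continue  # stale heap entry
        x, y = s
        for t in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if abs(t[0]) <= size and abs(t[1]) <= size:
                nd = d + edge_weight(s, t)
                if nd < dist.get(t, float("inf")):
                    dist[t] = nd
                    heapq.heappush(heap, (nd, t))
    raise ValueError("target vertex lies outside the box")


if __name__ == "__main__":
    rng = random.Random(2006)
    w = sample_weights(20, 1.0, 2.0, 0.7, rng)
    print(passage_time(w, 20, (0, 0), (10, 0)))
```

Since every edge weight lies in $[a, b]$, the value returned for vertices at lattice distance $d$ always lies between $ad$ and $bd$, which gives a quick sanity check on any run.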

Let 0 denote the origin of $\mathbb{R}^2$ and for all $x \in \mathbb{R}^2$, let $T(x) := T(0, x)$ be the first
passage time between 0 and $x$. It is well known by Kingman's subadditive ergodic
theorem ((1.13) of [9]) that, for all $x \in \mathbb{R}^2$, there is a constant $\mu_p = \mu_p(x)$, such that

(1.1) $\lim_{n\to\infty} \frac{T(nx)}{n} = \mu_p(x)$ a.s. and in $L^1$.

When $x = (1, 0)$, the limit $\mu_p := \mu_p((1, 0))$ is called the time constant of
Hammersley and Welsh [8]. Without loss of generality, for any $x \in \mathbb{R}^2$, we also call
$\mu_p(x)$ the time constant in the direction of $x$.

In general, physicists believe that most percolation constants should be real ana- 
lytic as functions of p, excepting the singularities at the critical case. In particular, 


Received September 2004; revised June 2005. 
¹Supported in part by NSA Grant MDA904-01-1-0029 and NSF Grant DMS-02-03720.
²Supported in part by NSF Grant DMS-04-05150.
AMS 2000 subject classification. 60K35. 
Key words and phrases. First passage percolation, shape theory, the right-hand edge, nondiffer- 
entiability of time constants. 




when $\omega_e$ only takes value 1 or 0, the behavior of the time constant is similar to
that of the correlation length [1]. Furthermore, the analyticity of the correlation
length, as expected, is proved for all $p$ except for the critical case when $d = 2$ [2].
Few rigorous results are known for the time constant. Cox and Kesten (Theorem 3
of [4]) show that $\mu_p$ is continuous with respect to the weak convergence of the
distribution of the passage times, from which it follows that $\mu_p$ is continuous in $p$.
With these observations, one might believe that both the correlation length and
the time constant are analytic except for the critical case when $\omega_e$ takes the
values 1 or 0. Furthermore, one might also expect that the behavior of the time
constant in the critical case is similar to the behavior in the case when $\omega_e$ takes the
values $a$ or $b$ with $0 < a < b$. We find here that the analyticity of the latter is not
always true. The main goal of this paper is to show there is a direction for which the
directional asymptotic speed is not three times differentiable in the parameter $p$.
Recall that the classical grid $\mathcal{L}$ for oriented percolation is given by $\mathcal{L} :=
\{(m, n) \in \mathbb{Z}^2 : m + n \text{ has even parity}, n \ge 0\}$. Thus, $\mathcal{L}$ is $\mathbb{Z}^2$ rotated by $\pi/4$ and
correctly dilated. Let $E(\mathcal{L})$ be the edges from $(m, n) \in \mathcal{L}$ to $(m + 1, n + 1)$ and to
$(m - 1, n + 1)$. To each edge $e \in E(\mathcal{L})$, we assign a passage time $a > 0$ with
probability $p$ and a time $b > a$ with probability $1 - p$. Henceforth, let $\Omega := \{a, b\}^{E(\mathcal{L})}$.
Let $p_c$ denote the critical probability for oriented Bernoulli percolation on $\mathcal{L}$.
For all $p \in (p_c, 1]$, consider all paths starting from $\{(x, y) \in \mathbb{Z}^2 : x \le 0, y = 0\}$ in
the oriented graph using $n$ type $a$ oriented edges $E(\mathcal{L})$ and let $(r_n(p), n)$ denote
the rightmost point ("right-hand edge") of all such paths. We will often simply
refer to the scalar $r_n(p)$ as the right-hand edge. In the super-critical regime
$p \in (p_c, 1]$, the rightmost point $(r_n(p), n)$ satisfies

(1.2) $\lim_{n\to\infty} \frac{r_n(p)}{n} = \alpha(p)$ a.s. and in $L^1$,

as well as a central limit theorem [10]. Here $\alpha(p) \in (0, 1]$ is called the asymptotic
speed of super-critical oriented percolation on the edges of $\mathcal{L}$. It describes the drift
of the rightmost point at level $n$.
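
The right-hand edge can be explored numerically. The following Python sketch is an illustration under stated assumptions, not the authors' construction: the name `right_edge` and the truncation of the initial half-line to the even sites of $[-\text{width}, 0]$ are choices made here. Truncating the starting set can only decrease the rightmost point, so the value is a lower bound for $r_n(p)$ that is typically sharp when `width` is large compared with $n$.

```python
import random


def right_edge(n, p, width, rng):
    """Rightmost x reachable at level n by oriented paths of type-a edges
    started from the even sites of [-width, 0] at level 0.
    Each edge (m, k) -> (m +/- 1, k + 1) is of type a with probability p.
    Returns None if no site at level n is reached."""
    wet = {x for x in range(-width, 1) if x % 2 == 0}
    for _ in range(n):
        nxt = set()
        for x in wet:
            # The two upward edges out of (x, level) are sampled independently;
            # each such edge is visited exactly once.
            if rng.random() < p:
                nxt.add(x + 1)
            if rng.random() < p:
                nxt.add(x - 1)
        wet = nxt
        if not wet:
            return None
    return max(wet)


if __name__ == "__main__":
    rng = random.Random(34)
    n = 2000
    r = right_edge(n, 0.8, 4 * n, rng)
    if r is not None:
        # r / n is a crude Monte Carlo estimate of the asymptotic speed alpha(p)
        print(r / n)
```

For $p = 1$ every edge is of type $a$ and the function returns $n$, in line with $\alpha(1) = 1$.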

If $p > p_c$, then the asymptotic shape [the unit radius ball for the norm induced
by the map $x \mapsto \mu_p(x)$] exhibits a flat edge [6], which is related directly to the
possibility of percolating with edges having passage time $a$. The flat edges of the
asymptotic shape are in the coordinate directions and are described analytically
by Marchand [12] (see especially Theorem 1.3).

Let $p_0 \in (p_c, 1)$ be fixed. For all $p \in (p_c, 1)$, define a time constant in the
direction of the critical vector with components $\alpha(p_0)$ and 1, that is, set

$$f_{p_0}(p) = \lim_{n\to\infty} \frac{E_p[T((\alpha(p_0)n, n))]}{n}.$$


It is easy to see (cf. Lemma 3.3 below for details) that if $p \ge p_0$, then on the
average there is an oriented path between 0 and $(\alpha(p_0)n, n)$ consisting of edges
having passage time $a$, that is, $f_{p_0}(p) = a$ for all $p \in [p_0, 1]$. Thus, if $p \mapsto
f_{p_0}(p)$ is three times differentiable at $p = p_0$, then the third derivative must be
zero. However, in what follows, we show there is a constant $C > 0$ such that, for
all $p \in (p_c, p_0)$, we have


(1.3) $f_{p_0}(p) \ge a + C (p_0 - p)^2/(-\log(p_0 - p))$.


This is enough to show that $p \mapsto f_{p_0}(p)$ is not three times differentiable at $p_0$.
This is our main result, formally stated as follows:


THEOREM 1.1. For all $p_0 \in (p_c, 1)$, the function $p \mapsto f_{p_0}(p)$ is not three
times differentiable at $p = p_0$.


REMARKS. 1. Hammersley and Welsh conjecture (Corollary 6.5.5 of [8]) that
$\mu_p$ is concave in $p$ and thus differentiable for almost all $p$. One might also expect
that $p \mapsto f_{p_0}(p)$ is concave and differentiable, but we are unable to show it.

2. Theorem 1.1 can be generalized to include passage times having a common
distribution $p\delta_a + (1 - p)U(b)$, where $0 < a < b$, $p \in [0, 1]$, and $U(b)$ is an
independent random variable bounded below by $b$. It is unclear (at least to us) whether
Theorem 1.1 remains true for (i) more general passage times, or (ii) directions
other than $(\alpha(p_0)n, n)$. It is also unclear whether the lower bound (1.3) can be
improved to $f_{p_0}(p) \ge a + C(p_0 - p)/(-\log(p_0 - p))$.

3. A natural problem involves studying the properties of the asymptotic shape 
at the end of its flat edge for a fixed p. Our methods do not yield any information 
here. 


2. Probability bounds for the right-hand edge of super-critical percolation.
The following proposition is of independent interest and provides exponential tail
bounds for the right-hand edge $r_n(p)$, $p \in (p_c, 1]$. We will make critical use of
this estimate in the sequel, but for now we note that Proposition 2.1 should be
compared with the general tail bounds of Kuczek and Crank [11] (Theorem 1,
part 1), who show, for all $p \in (p_c, 1]$ and all $0 < \varepsilon < 1$, that there are constants
$K_1 := K_1(p, \varepsilon)$ and $K_2 := K_2(p, \varepsilon)$ such that, for all $n = 1, 2, \ldots$,


P [rn (p) = (a(p) + e)n] < Kin V? exp(— Kan). 
PROPOSITION 2.1. For all $q \in (p_c, 1]$, there exists $C_1 := C_1(q) > 0$ such that
for all $0 < \varepsilon < 1$, all $p \in [q, 1]$, and all $n = 1, 2, \ldots$,
$$P_p[r_n(p) \ge (\alpha(p) + \varepsilon)n] \le C_1 n \exp(-\varepsilon^2 n/C_1).$$


The proof of Proposition 2.1 involves consideration of the renewal process arising
by breaking the behavior of the rightmost point $r_n(p)$ into independent pieces,
an approach developed by Kuczek [10]. Our methods require an exponential decay
result on the size of a finite cluster in super-critical oriented percolation [5].

Before proving Proposition 2.1, we require some terminology [10] and a lemma.
Given vertices $u$ and $v$ in $\mathcal{L}$, we say $u \to v$ if there is a sequence $v_0 =
u, v_1, \ldots, v_m = v$ of points of $\mathcal{L}$ with $v_i := (x_i, y_i)$ and $v_{i+1} := (x_{i+1}, y_i + 1)$
for $0 \le i \le m - 1$ such that $v_i$ and $v_{i+1}$ are connected by an edge with weight $a$.
Thus, $u \to v$ if there is a sequence of oriented edges each with weight $a$ joining $u$
to $v$. For $A \subset \mathbb{Z}$, let

$$\xi_n^{A} := \{x : (x, n) \in \mathcal{L} \text{ and } \exists x' \in A \text{ such that } (x', 0) \to (x, n)\} \quad \text{for } n > 0.$$

As in [10], denote the event that there exists an infinite oriented path of $a$ edges
starting from $(x, y)$ by $\Omega_\infty^{(x,y)}$. We let $\xi'_0 := \xi_0^{\{0\}} := \{0\}$ and set

$$\xi'_1 := \begin{cases} \xi_1^{\{0\}}, & \text{if } \xi_1^{\{0\}} \ne \emptyset,\\ \{1\}, & \text{otherwise}, \end{cases}$$

and define inductively, for all $n = 1, 2, \ldots$,

$$\xi'_{n+1} := \begin{cases} \{x : (x, n + 1) \in \mathcal{L} \text{ and } (y, n) \to (x, n + 1) \text{ for some } y \in \xi'_n\}, & \text{if this set is nonempty},\\ \{n + 1\}, & \text{otherwise}. \end{cases}$$


We have suppressed the dependence of $\xi'_n$ on $p$ for notational convenience. Note
that $\xi'_n$ is a subset of the integers between $-n$ and $n$. Let

$$r'_n(p) := \sup\{x : x \in \xi'_n\}.$$

On $\{\xi_n^{\{0\}} \ne \emptyset\}$, we have equivalence between $r'_n(p)$ and the right-hand edge
$r_n(p)$. A vertex $(x, n) \in \mathcal{L}$ is said to be a percolation point if and only if the
event $\Omega_\infty^{(x,n)}$ occurs. Let

$$T_1 := \inf\{n \ge 1 : (r'_n, n) \text{ is a percolation point}\},$$

$$T_2 := \inf\{n \ge T_1 + 1 : (r'_n, n) \text{ is a percolation point}\},$$

$$T_m := \inf\{n \ge T_{m-1} + 1 : (r'_n, n) \text{ is a percolation point}\},$$

where we make the convention that $\inf \emptyset = \infty$. Define

$$\tau_1 := T_1, \quad \tau_2 := T_2 - T_1, \quad \ldots, \quad \tau_m := T_m - T_{m-1},$$

where $\tau_i := 0$ if $T_i$ and $T_{i-1}$ are infinite. (Note that $T_i$ and $T_{i-1}$ are finite with
probability one.) Also define

$$X_1 := r'_{T_1}, \quad X_2 := r'_{T_2} - r'_{T_1}, \quad \ldots, \quad X_m := r'_{T_m} - r'_{T_{m-1}},$$

where $X_i := 0$ if $T_i = \infty$ and $T_{i-1} = \infty$. The points $\{(r'_{T_i}, T_i)\}$ are called break
points [10] since they break the behavior of the right-hand edge into i.i.d. pieces
when the origin is a percolation point. Kuczek (Theorem on page 1324, [10])
proved that, conditional on $\Omega_\infty^{(0,0)}$, $\{(X_i, \tau_i)\}$ are i.i.d. with all moments.
Moreover, for all $q \in (p_c, 1]$, there exists a positive constant $C_2 := C_2(q)$ such that, for
all $p \in [q, 1]$ and all $t \ge 1$,


Qi) — Ppin >t] <P [ELP +ø, (1,1) A oo] < Crexp(—t/Cr), 


where the last inequality is as in [5], Section 12.
If we set

$$N_n := \sup\Big\{m : \sum_{i=1}^{m} \tau_i \le n\Big\},$$

then $r_{T_{N_n+1}}$ is the location of the right-hand edge at the first "regeneration point"
after time $n$. By considering $|r_{T_{N_n+1}} - r_{T_{N_n}}|$ and $|r_n - r_{T_{N_n}}|$, it easily follows that

(2.2) $|r_{T_{N_n+1}} - r_n| \le 2\tau_{N_n+1}$

(see page 1331, [10] for details).
To prove Proposition 2.1, we make use of the following probability measure
on $\Omega$:

$$\bar{P}_p[\cdot] := P_p[\,\cdot \mid \Omega_\infty^{(0,0)}].$$

Let $\bar{E}_p$ denote the expected value with respect to $\bar{P}_p$. If the event $\{r_n(p) \ge (\alpha(p) +
\varepsilon)n\}$ occurs for a particular configuration $\omega \in \Omega$ of edges, then it also occurs for
any configuration $\omega'$ whose $a$ edges are a superset of the $a$ edges in $\omega$. Thus, the
event $\{r_n(p) \ge (\alpha(p) + \varepsilon)n\}$ is increasing. Similarly, $\Omega_\infty^{(0,0)}$ is an increasing event
so that, by the FKG inequality,

$$P_p[\Omega_\infty^{(0,0)}]\, P_p[r_n(p) \ge (\alpha(p) + \varepsilon)n] \le P_p[r_n(p) \ge (\alpha(p) + \varepsilon)n,\ \Omega_\infty^{(0,0)}],$$

that is to say,

$$P_p[r_n(p) \ge (\alpha(p) + \varepsilon)n] \le \bar{P}_p[r_n(p) \ge (\alpha(p) + \varepsilon)n].$$


LEMMA 2.1. Let $q \in (p_c, 1]$. There exists $C_3 := C_3(q)$ such that for all $0 <
\varepsilon < 1$, all $p \in [q, 1]$, and all $n = 1, 2, \ldots$,

(2.3) $\bar{P}_p[\tau_{N_n+1} \ge \varepsilon n] \le C_3 n \exp(-\varepsilon n/C_3)$.


We defer the proof of Lemma 2.1 and instead show how it implies Proposition 2.1.
For convenience, we put $\alpha := \alpha(p)$ and $r_n := r_n(p)$.


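PROOF OF PROPOSITION 2.1. By the definition of $N_n$ and (2.2) we have, for
all $0 < \varepsilon < 1$ and all $n = 1, 2, \ldots$,

$$\begin{aligned} P_p[r_n \ge (\alpha + \varepsilon)n] &\le \bar{P}_p[r_n \ge (\alpha + \varepsilon)n] \\ &\le \bar{P}_p[r_{T_{N_n+1}} + 2\tau_{N_n+1} \ge (\alpha + \varepsilon)n] \\ &\le \bar{P}_p[r_{T_{N_n+1}} \ge (\alpha + \varepsilon/2)n] + \bar{P}_p[\tau_{N_n+1} \ge \varepsilon n/4]. \end{aligned}$$

By Lemma 2.1 and since $\alpha \le 1$, the above is bounded by

(2.4) $\bar{P}_p[X_1 + \cdots + X_{N_n+1} \ge \alpha(1 + \varepsilon/2)n] + C_3 n \exp(-\varepsilon n/4C_3)$.

Put $\kappa := \kappa(p) := \bar{E}_p[\tau_1]$ and note that $\kappa \ge 1$ by definition of $\tau_1$. For $n \ge \kappa$, let
$m := \lfloor \frac{n}{\kappa}(1 + \varepsilon/4) \rfloor$, where, for all $x \in \mathbb{R}$, $\lfloor x \rfloor$ denotes the greatest integer less than
or equal to $x$. It follows that the above is less than or equal to

$$\sum_{i=1}^{m} \bar{P}_p[X_1 + \cdots + X_i \ge \alpha(1 + \varepsilon/2)n] + \bar{P}_p[N_n + 1 \ge m + 1] + C_3 n \exp(-\varepsilon n/4C_3).$$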


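Denote the first two terms in the above inequality by I and II. For simplicity,
we put $Y_j := \kappa - \tau_j$. Thus, by definition of $\kappa$,

$$\begin{aligned} II := \bar{P}_p[N_n + 1 \ge m + 1] &= \bar{P}_p\Big[\sum_{j=1}^{m} \tau_j \le n\Big] = \bar{P}_p\Big[\sum_{j=1}^{m} (\kappa - Y_j) \le n\Big] \\ &\le \bar{P}_p\Big[\sum_{j=1}^{m} Y_j \ge \kappa(n/\kappa + \varepsilon n/4\kappa - 1) - n\Big] \\ &= \bar{P}_p\Big[\sum_{j=1}^{m} Y_j \ge \varepsilon n/4 - \kappa\Big]. \end{aligned}$$

By Markov's inequality, for all $r > 0$,

(2.5) $II \le \exp(r\kappa) \exp(-r\varepsilon n/4)\, \bar{E}_p \exp\Big(r \sum_{j=1}^{m} Y_j\Big)$.

Since $\bar{E}_p[Y_1] = 0$ and since all moments of $Y_1$ exist, it follows that, for all $p \in
[q, 1]$, there exists $C_4 := C_4(q)$ such that $\log \bar{E}_p[\exp(rY_1)] \le C_4 r^2$ if $r \le r_0 :=
r_0(q)$. Thus, for $r \le r_0(q)$, we obtain

$$II \le \exp(r\kappa - r\varepsilon n/4 + C_4 m r^2).$$

If we let $r := \varepsilon\kappa/C$ and increase $C$ if necessary, then it follows that there exists
$C_5 := C_5(q)$ such that, for all $0 < \varepsilon < 1$, all $n \ge \kappa$ and $p \in [q, 1]$,

(2.6) $II \le C_5 \exp(-\varepsilon^2 n/C_5)$.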


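Increasing the value of $C_5$ if necessary, we see that (2.6) holds for $n \in [1, \kappa]$ as
well.

Now we bound term I. By Lemma 1 of [13], we know $\alpha = \bar{E}_p X_1/\kappa$ and thus,
by definition of $m$, we have, for all $1 \le i \le m$,

$$\bar{E}_p[X_1 + \cdots + X_i] = i \bar{E}_p X_1 \le \frac{n}{\kappa}(1 + \varepsilon/4)\, \bar{E}_p X_1 = \alpha n(1 + \varepsilon/4).$$

Thus,

$$\begin{aligned} I &\le \sum_{i=1}^{m} \bar{P}_p\Big[\sum_{j=1}^{i} (X_j - \bar{E}_p X_j) \ge \alpha n(1 + \varepsilon/2) - \alpha n(1 + \varepsilon/4)\Big] \\ &= \sum_{i=1}^{m} \bar{P}_p\Big[\sum_{j=1}^{i} (X_j - \bar{E}_p X_j) \ge \alpha\varepsilon n/4\Big]. \end{aligned}$$

Since $|X_j| \le |\tau_j|$ for all $j \le i$, where $i \le m \le 2n$, we may follow the approach
used for the bound (2.6) to conclude that there exists $C_6 := C_6(q)$ such that, for all
$0 < \varepsilon < 1$, $p \in [q, 1]$, and all $n = 1, 2, \ldots$,

(2.7) $I \le C_6 n \exp(-\varepsilon^2 n/C_6)$.

Recalling that
$$P_p[r_n \ge (\alpha + \varepsilon)n] \le I + II + C_3 n \exp(-\varepsilon n/4C_3)$$
and applying the bounds (2.6) and (2.7), we obtain Proposition 2.1 as desired. □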


Now it remains to show Lemma 2.1. 


PROOF OF LEMMA 2.1. By definition of $N_n$, we have, for all $0 < \varepsilon < 1$, all
$p \in (p_c, 1]$, and all $n = 1, 2, \ldots$,


Pp[tw, +1 > en] = Ys > en, N, =å] 


oo i i+] 
=) P, |t; > en, Tk ZA, hon 
YP pl asi Sem Y usn n>n] 


k=l k=] 


oo i i 
» YP push asmua]; 
j k=] k=1 




Under the measure $\bar{P}_p$, the $\{\tau_i\}$ are independent and, thus, the above equals


Y Bua nt, PES Yan | 


jzen 


z Ppl > P [Xs pide i 


jzen ix2n/k 


ty | X) aan i 


i»2n/k k= 
c= 4+ H. 


Let us bound II. Notice that if $i > 2n/\kappa$, then $i\kappa - n \ge i\kappa/2$, so we have


P,| Dou <n|=P j| X 02i -n| <P, [Xe-wsz| 
k=] 


k=1 


By the methods used to obtain (2.6), there exists $C_7 := C_7(q)$ and $C_8 := C_8(q)$
such that, for all $p \in [q, 1]$ and all $n = 1, 2, \ldots$,


(2.8) Hx Y Criexp(-i/Cj) < Caexp(-n/ Cy). 


i>n/k+n 


Let us bound term I. The second factor in I is bounded by the number of
summands, showing that


Ix (= " 2) > Pp[ti = 7 | < 2nP [11 > > Een, 


jzen 


since $\kappa \ge 1$. Combining this with (2.1) shows that there exists $C_9 := C_9(q)$ such
that, for all $0 < \varepsilon < 1$, all $p \in [q, 1]$, and all $n = 1, 2, \ldots$,

$$I \le C_9 n \exp(-\varepsilon n/C_9).$$

Lemma 2.1 now follows from (2.8) and the above inequality. □


3. Auxiliary lemmas. The proof of Theorem 1.1 rests on the upper bound
for the right-hand edge of supercritical percolation (Proposition 2.1), as well as a
lower bound for first passage times, given in the upcoming Proposition 4.1.
Before proving the latter, we require six straightforward lemmas. Our first lemma
gives a way to prove the asserted nondifferentiability of $f_{p_0}$, where we recall that
$p_0 \in (p_c, 1)$ is fixed once and for all. Let log denote the natural logarithm. For the
remainder of the paper, we fix $q \in (p_c, p_0)$.




LEMMA 3.1. Suppose $h : [0, 1] \to \mathbb{R}^{+}$ satisfies $h(p) = 0$ for all $p \ge p_0$. If
there exists $\delta := \delta(q) > 0$ such that, for all $p \in [q, p_0)$,

(3.1) $h(p) \ge \dfrac{\delta(p_0 - p)^2}{\log(1/(p_0 - p))}$,

then $h'''(p_0)$ does not exist.


PROOF. We use elementary calculus. If $h'''(p_0)$ did exist, then necessarily
$h'''(p_0) = h''(p_0) = h'(p_0) = 0$. It follows that $|h''(p)| = |h''(p) - h''(p_0)| \le
|p_0 - p|$ if $|p - p_0|$ is small enough. For such $p$, we have $|h'(p)| = |\int_p^{p_0} h''(u)\,du| \le
\int_p^{p_0} |h''(u)|\,du \le (p_0 - p)^2$, that is, $h'(p)$ grows at most like a quadratic in $p_0 - p$.
Similarly, $h(p)$ grows at most like a cubic in $p_0 - p$ for $|p - p_0|$ small enough.
This is a contradiction. □
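
The contradiction in the proof comes from comparing growth rates as $x = p_0 - p \to 0$: a lower bound of order $x^2/\log(1/x)$ eventually exceeds any multiple of $x^3$, although it is still $o(x^2)$. The short Python check below illustrates this numerically. It is only a sanity check, not part of the proof; the value `delta=0.1` and the function names are arbitrary choices made here.

```python
import math


def lower_bound(x, delta=0.1):
    """The lower bound delta * x^2 / log(1/x) from (3.1), with x = p0 - p in (0, 1)."""
    return delta * x ** 2 / math.log(1.0 / x)


def ratio_to_power(x, k, delta=0.1):
    """Ratio of the lower bound to x^k."""
    return lower_bound(x, delta) / x ** k


if __name__ == "__main__":
    for j in range(1, 7):
        x = 10.0 ** (-j)
        # The ratio to x^3 blows up, while the ratio to x^2 tends to zero.
        print(x, ratio_to_power(x, 3), ratio_to_power(x, 2))
```

The ratio to $x^3$ equals $\delta/(x\log(1/x))$, which is unbounded near 0, so a cubic upper bound on $h$ cannot coexist with (3.1). The ratio to $x^2$ equals $\delta/\log(1/x)$, which tends to 0, so this bound alone does not rule out a second derivative at $p_0$.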


To show that the function $f_{p_0}$ of Theorem 1.1 satisfies the conditions of
Lemma 3.1, we will need several more lemmas and a proposition.


LEMMA 3.2. For all $p \in (p_c, p_0]$, we have $\alpha(p_0) - \alpha(p) \ge 2(p_0 - p)$.
PROOF. See [5], page 1006, display (12). □
LEMMA 3.3. $f_{p_0}(p) = a$ for all $p \in [p_0, 1]$.


PROOF. By the central limit theorem of Kuczek (Corollary 1 of [10]), with
probability $1 - o(1)$, there exists an oriented path $\gamma$ of $n$ type $a$ edges, starting
at 0 and terminating at a point $(x, n)$, where $\alpha(p_0)n \le x$. Similarly, reversing the
orientation of the edges, with probability $1 - o(1)$, there exists a path $\gamma'$ of $n$
type $a$ oriented edges, starting at $(\alpha(p_0)n, n)$ and terminating at a point $(s, 0)$,
where $s > \alpha(p_0)n$. The paths $\gamma$ and $\gamma'$ intersect at some point $Q \in \mathbb{Z}^2$. Let $\gamma_1$ be
the restriction of $\gamma$ between 0 and $Q$; let $\gamma_1'$ be the restriction of $\gamma'$ between $Q$
and $(\alpha(p_0)n, n)$. Let $\gamma_n$ be the union of $\gamma_1$ and $\gamma_1'$. Then $\gamma_n$ is an oriented path
$0 \to Q \to (\alpha(p_0)n, n)$ consisting exclusively of $n$ type $a$ edges showing that

(3.2) $T((\alpha(p_0)n, n)) = an$

on a set with probability $1 - o(1)$. Since $n^{-1} T((\alpha(p_0)n, n))$ is bounded by $b$, the
conclusion follows. □


We will adhere to the following terminology throughout. Given a path $\gamma$ in the
lattice $\mathcal{L}$, $T(\gamma)$ denotes its weight $\sum_{e \in \gamma} \omega_e$, where $P[\omega_e = a] = p$, $P[\omega_e = b] =
1 - p$. We let $\mathcal{P}(\alpha(p_0)n)$ denote all paths (oriented or not) $\gamma : 0 \to (\alpha(p_0)n, n)$
in the lattice $\mathcal{L}$ whose weight equals the first passage time $T((\alpha(p_0)n, n))$. [If
$x \in \mathbb{R}$, then we adopt the convention that the path $\gamma : 0 \to (x, n)$ denotes the path
between 0 and $(\lfloor x \rfloor, n)$.] If $p \in (p_c, p_0]$, then $T(\gamma)$, $\gamma \in \mathcal{P}(\alpha(p_0)n)$, will tend to
exceed $an$, since typically, under $P_p$, the edges in $\gamma$ required to link 0 with points
to the right of $(\alpha(p)n, n)$, for example, $(\alpha(p_0)n, n)$, will not all have weight $a$.

Consider $\delta := \delta(q) \in (0, 1/2)$ with a value to be specified later. For all $p \in
[q, p_0]$, let $\mathcal{P}_n := \mathcal{P}_n(p_0, p, \delta) \subset \mathcal{P}(\alpha(p_0)n)$ be the (possibly empty) subset of
$\mathcal{P}(\alpha(p_0)n)$ consisting of paths $\gamma$ whose weight satisfies

$$T(\gamma) \le an\Big(1 + \frac{\delta(p_0 - p)^2}{\log(1/(p_0 - p))}\Big).$$

Thus, $\mathcal{P}_n \ne \emptyset$ is the event that the first passage time $T((\alpha(p_0)n, n))$ is bounded
above by $an\big(1 + \frac{\delta(p_0 - p)^2}{\log(1/(p_0 - p))}\big)$. We will show in Proposition 4.1 below that the
probability of $\mathcal{P}_n \ne \emptyset$ is exponentially small, but first we require a few more
lemmas. Recalling that $p_c < q < p_0 < 1$ and $p \in [q, p_0]$, we will henceforth assume,
without loss of generality, that $q$ is close enough to $p_0$ to guarantee that
V 
(3.3) c—— e cond id les( 


1 
1. 
log(1/(po — p) ~ Po — ;) i 


LEMMA 3.4. If $\gamma \in \mathcal{P}_n$, then $\gamma \subset [-2n, 2n] \times [-n, 2n]$.


PROOF. It suffices to show that if $\gamma \in \mathcal{P}_n$, then $\gamma$ has at most $2n$ edges. Since
$\delta < 1/2$ and $\frac{(p_0 - p)^2}{\log(1/(p_0 - p))} < 1$, it follows that if $\gamma \in \mathcal{P}_n$, then $T(\gamma) < 2an$. Since
every edge in $\gamma$ has weight at least $a$, it follows that $\gamma$ has at most $2n$ edges. □


Given $\gamma \in \mathcal{P}(\alpha(p_0)n)$, an edge $e := ((x_1, y_1), (x_2, y_2))$ belonging to $\gamma$ is
termed "repeated" if the horizontal strip $\mathbb{R} \times [y_1, y_2]$ contains at least one other
edge in $\gamma$ and to the left of $e$. Edges $e \in \gamma$ are called "sub-optimal" if either $e$ has
weight $b$ or if $e$ is repeated. Roughly speaking, paths $\gamma \in \mathcal{P}_n$ cannot use many
sub-optimal edges. Edges $e := (u, v)$ are considered to be closed line segments in $\mathbb{R}^2$
in the sense that $e$ contains its endpoints $\{u\}$ and $\{v\}$.


LEMMA 3.5. Let $\nu := (\min(b - a, a))^{-1}$. If $\gamma \in \mathcal{P}_n$, then there are at most

(3.4) $k := k(p, p_0, n) := \dfrac{a\nu\delta(p_0 - p)^2 n}{\log(1/(p_0 - p))}$

sub-optimal edges in $\gamma$.


PROOF. Each sub-optimal edge in $\gamma$ contributes an extra cost of at least
$\min(b - a, a)$. □

Recalling that $p_c < q < p_0 < 1$ and $p \in [q, p_0]$, we will henceforth assume,
without loss of generality, that $q$ is close enough to $p_0$ to guarantee that (3.3) holds
and that $k \in [0, n/2]$. Given $\gamma \in \mathcal{P}_n$, project all sub-optimal edges in $\gamma$ onto the
x-axis. The projection forms a possibly empty collection of closed intervals on the
x-axis which may overlap. However, when the projection is nonempty, the union
forms a collection of disjoint closed intervals $I_1(\gamma), I_2(\gamma), \ldots, I_j(\gamma)$ called the
x-trace $\tau_x(\gamma)$ of $\gamma \in \mathcal{P}_n$. The intervals in $\tau_x(\gamma)$ have integral endpoints and belong
to $[-2n, 2n]$ by Lemma 3.4. Here $j \in \mathbb{N}$ cannot exceed the number $k$ of
sub-optimal edges; if $k = 0$, then there is no x-trace. Note that distinct paths $\gamma \in \mathcal{P}_n$
may have identical x-traces.


DEFINITION 3.1. For all $1 \le j \le k$, let $\mathcal{T}_j^{x}$ denote the collection of all
x-traces consisting of $j$ disjoint subintervals.


Next, given $\gamma \in \mathcal{P}_n$, remove all edges in $\gamma$ whose projection onto the x-axis is
a proper subset of $\tau_x(\gamma)$ (some such edges may be oriented and have weight $a$).
What remains are called the optimal edges in $\gamma$; such edges are necessarily
oriented up edges with weight $a$. By definition, these edges collectively form a
sequence of disjoint paths $\gamma_1, \gamma_2, \ldots$, each consisting of oriented edges having
weight $a$. We call $\gamma_1, \gamma_2, \ldots$ "optimal paths." Note that optimal paths lie in
$[-2n, 2n] \times [0, n]$.

Observe that the $\gamma_i$, $i \ge 1$, are contained in the horizontal strips $\mathbb{R} \times [y_i, y_i']$,
where $y_i$ and $y_i'$ denote the y coordinates of the initial and terminal points of $\gamma_i$,
respectively.

We project all optimal edges in $\gamma$ onto the (vertical) y-axis. The projection
yields a collection of intervals $I_1'(\gamma), I_2'(\gamma), \ldots$, which we call the y-trace $\tau_y(\gamma)$
of $\gamma$. Each interval in $\tau_y(\gamma)$ is a subset of $[0, n]$.


DEFINITION 3.2. For all $1 \le j \le k$, let $\mathcal{T}_j^{y}$ denote the collection of all
y-traces consisting of $j$ subintervals.


Given y € Pa, we call the set of intervals Try :— {J; (yy "s yy}, the 
xy-trace of y. The collection of xy-traces will provide a convenient combinatorial 
way to upper bound the probability that J^, Æ Ø. Since the number of optimal 
paths differs from the number of disjoint M in the x-trace by at most one, 
it follows that |j; — j2| x 1. We say that z,, is an xy-trace of cardinality j if 
Ji V h = j. Considering the three cases ji = jo, ji = jo — 1, and jo = ji — 1, we 
see that the collection of all xy-traces of cardinality j has the representation 


7; = (Ui, Da h ETF, eT?) 


U (d, ID]. 1° [Te JT; tp lj=Ø,l €T?) 


U {U ID} phe TF y eT? / p=}. 




Since elements of J; and T? have integral endpoints, Lemma 3.4 implies that 


card 77 < 2): Notice that the elements of 7? have integral endpoints which 

may coincide (they coincide if there is an integer i such that y; = yj41). The 

elements of 7? can be coded by their endpoints {(y;, y; is so that, for ex- 

ample, the sequence 1, 2, 2, 5, 7, 8 denotes the following three intervals on the 

y-axis: Ij := ((0, 1), (0, 2)), L := ((0, 2), (0, 5)), B := ((0, 7), (0, 8)). Clearly, 
4 


T? < (25. Since clearly (3) < (2;) for 1 < j < k, we deduce the crude bound: 


LEMMA 3.6. For all 1 <j <k, we have card 7; < 3(9) . 


4. Lower bounds for first passage times. Recall that $q$ and $p_0$ are fixed
scalars satisfying $p_c < q < p_0$. By Lemma 3.3, we have $f_{p_0}(p) - a = 0$ for all
$p \in [p_0, 1]$. It remains to show that $f_{p_0} - a$ satisfies inequality (3.1). We do this
by showing that the first passage time $T((\alpha(p_0)n, n))$ is bounded below by

$$an\Big(1 + \frac{\delta(p_0 - p)^2}{\log(1/(p_0 - p))}\Big)$$

with overwhelming probability for $p \in [q, p_0]$. Recalling the definition of $C_1$ in
Proposition 2.1, we have the following:


PROPOSITION 4.1. For all p € [q, po] and all n = 1,2,..., 
Pp[Px(po, p, 9) € €] x Cin? exp(— (po — p)’n/4C)). 


Before proving Proposition 4.1, we first show how it implies that $f_{p_0} - a$
satisfies the conditions of Lemma 3.1. We have, for all $p \in [q, p_0]$,

$$\begin{aligned} f_{p_0}(p) &= \lim_{n\to\infty} \frac{E_p[T((\alpha(p_0)n, n))]}{n} \\ &\ge \liminf_{n\to\infty} \frac{E_p[T((\alpha(p_0)n, n))\, \mathbf{1}(\mathcal{P}_n = \emptyset)]}{n} \\ &\ge a + \frac{a\delta(p_0 - p)^2}{\log(1/(p_0 - p))} \end{aligned}$$

by Proposition 4.1 and since $T((\alpha(p_0)n, n)) \le bn$. Since $\delta > 0$, then together with
Lemma 3.3, this shows that $f_{p_0} - a$ satisfies the conditions of Lemma 3.1,
concluding the proof of Theorem 1.1.

Roughly speaking, Proposition 4.1 holds for the following reasons. If
$T((\alpha(p_0)n, n))$ is small [i.e., bounded above by $an\big(1 + \frac{\delta(p_0 - p)^2}{\log(1/(p_0 - p))}\big)$], then the
shortest travel time path cannot have too many sub-optimal edges. The path to
$(\alpha(p_0)n, n)$ is thus nearly an oriented path with only $a$ edges. However, with such
edges, an oriented path will typically only reach $(\alpha(p)n, n)$, where $\alpha(p) < \alpha(p_0)$.
The estimate of the probability of the complement of such an event is handled by
Proposition 2.1 and some combinatorial estimates.

We note here that if $T((\alpha(p_0)n, n))$ could be bounded below by $an\big(1 +
\frac{\delta(p_0 - p)}{\log(1/(p_0 - p))}\big)$ with high probability, then our proof would show that $p \mapsto f_{p_0}(p)$ is
not two times differentiable at $p = p_0$. We are unfortunately unable to show such
a bound.

To prove Proposition 4.1, we introduce some terminology. Given $l = 1, 2, \ldots$,
say that a path $\gamma$ has rightward displacement of $l$ if the difference between the
x-components of the terminal and initial points of $\gamma$ equals $l$. For all integral
$m \in [n - k, n]$, $\varepsilon > 0$, and $p \in [q, 1]$, let $D(n, m, p, \varepsilon) \subset \Omega$ denote the event that
there exists an optimal path beginning at 0 containing $m$ edges, and with rightward
displacement at least $(\alpha(p) + \varepsilon)n$. Proposition 2.1 implies, for all $p \in [q, 1]$ and
all $n = 1, 2, \ldots$,

(4.1)
$$\begin{aligned} P_p[D(n, m, p, \varepsilon)] &\le P_p[r_m \ge (\alpha(p) + \varepsilon)n] \\ &\le C_1 m \exp(-\varepsilon^2 m/C_1) \\ &\le C_1 n \exp(-\varepsilon^2 n/2C_1), \end{aligned}$$

since $n/2 \le m \le n$. We are now ready to provide the following:


PROOF OF PROPOSITION 4.1. Let $p \in [q, p_0]$ and suppose $\mathcal{P}_n \neq \emptyset$. For any
$\gamma \in \mathcal{P}_n$, let $d_{\mathrm{opt}}(\gamma)$ be the total rightward displacement by the optimal edges in $\gamma$.
In other words, $d_{\mathrm{opt}}(\gamma)$ is the combined length of the projection of the optimal
edges in $\gamma$ onto the x-axis. Equivalently, $d_{\mathrm{opt}}(\gamma)$ is the difference between the
rightward displacement of $\gamma$ and the sum of the lengths of the intervals in the
x-trace $\tau_x(\gamma)$. For any $\gamma \in \mathcal{P}_n$, we clearly have $d_{\mathrm{opt}}(\gamma) \ge \alpha(p_0)n - k$, that is,


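\[
\begin{aligned}
d_{\mathrm{opt}}(\gamma) &\ge \alpha(p_0)n - \Bigl\lfloor \frac{av\delta(p_0 - p)^2 n}{\log(1/(p_0 - p))} \Bigr\rfloor \\
&\ge \alpha(p)n + \frac{(\alpha(p_0) - \alpha(p))n}{2}
 + \Bigl\{ \frac{(\alpha(p_0) - \alpha(p))n}{2} - \frac{av\delta(p_0 - p)^2 n}{\log(1/(p_0 - p))} \Bigr\}.
\end{aligned}
\]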


By Lemma 3.2, the term inside the braces exceeds n(po — p)(1— ere), 
which by (3.3) is nonnegative. Therefore, for all y € Py, 
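\[
d_{\mathrm{opt}}(\gamma) \ge \alpha(p)n + \Bigl(\frac{\alpha(p_0) - \alpha(p)}{2}\Bigr)n \ge \alpha(p)n + (p_0 - p)n.
\]

Let $\tilde{\mathcal{P}}_n$ denote all (not necessarily oriented) paths in the lattice $\mathcal{L}$ beginning at 0
and ending at a point $(m, n)$, $m \in \mathbb{N}$, with an xy-trace having cardinality at most $k$.
We thus have

\[
\begin{aligned}
P_p[\mathcal{P}_n \neq \emptyset]
&\le P_p[\exists \gamma \in \tilde{\mathcal{P}}_n : d_{\mathrm{opt}}(\gamma) \ge \alpha(p)n + (p_0 - p)n] \\
&= P_p[\exists \gamma \in \tilde{\mathcal{P}}_n : d_{\mathrm{opt}}(\gamma) \ge \alpha(p)n + (p_0 - p)n,\ \tau_{xy}(\gamma) = \emptyset] \\
&\quad + \sum_{j=1}^{k} P_p[\exists \gamma \in \tilde{\mathcal{P}}_n : d_{\mathrm{opt}}(\gamma) \ge \alpha(p)n + (p_0 - p)n,\ \tau_{xy}(\gamma) \in \mathcal{T}_j],
\end{aligned}
\]

since $\tilde{\mathcal{P}}_n$ is the disjoint union (over $T$ in $\mathcal{T}_j$ and $j \in \{1, 2, \ldots, k\}$) of paths in $\mathcal{L}$
beginning at 0 and having an xy-trace $T$ for some $T \in \mathcal{T}_j$ and some $1 \le j \le k$. By
additivity, the above equals

\[
\begin{aligned}
&P_p[\exists \gamma \in \tilde{\mathcal{P}}_n : d_{\mathrm{opt}}(\gamma) \ge \alpha(p)n + (p_0 - p)n,\ \tau_{xy}(\gamma) = \emptyset] \\
&\quad + \sum_{j=1}^{k} \sum_{T \in \mathcal{T}_j} P_p[\exists \gamma \in \mathcal{L} : d_{\mathrm{opt}}(\gamma) \ge \alpha(p)n + (p_0 - p)n,\ \tau_{xy}(\gamma) = T].
\end{aligned}
\tag{4.2}
\]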


Consider a fixed xy-trace $T \in \mathcal{T}_j$. Each such trace $T$ is uniquely defined by a
set of deterministic points $\{(P_i, P_i')\}_{i=1}^{2j}$, where $(P_i, P_i') \in \mathcal{L}$, $1 \le i \le 2j$, are the
endpoints of $j$ optimal paths.

By independence and invariance by translation, the probability that there exists
an optimal path between $(P_1, P_1')$ and $(P_2, P_2')$ and a second optimal path between
$(P_3, P_3')$ and $(P_4, P_4')$ equals the probability that there exists an optimal path
joining 0, the point $(P_2 - P_1, P_2' - P_1')$ and the point

\[ \bigl((P_2 - P_1) + (P_4 - P_3),\ (P_2' - P_1') + (P_4' - P_3')\bigr). \]


More generally, the probability that there exist optimal paths joining (P;, P;) 
and (Pj41, P;,,), for all 1 <i € 2j — 1, is Dra by the -a that there 
exists an optimal path between 0 and yarn (Pam Bi, 2 PL P; 4l — Pj). 
Any such path has a total of N :— a Deum “I E P7) edges, where N e [n — 
k, n − 1]. Thus, for each $1 \le j \le k$ and each $T \in \mathcal{T}_j$, each summand in (4.2)
is bounded by the probability that there is an optimal path with $N$ edges with
rightward displacement at least $\alpha(p)n + (p_0 - p)n$, that is, by the probability
of $D(n, N, p, p_0 - p)$. Similarly, the first probability in (4.2) is bounded by the
probability of $D(n, n, p, p_0 - p)$. It follows by Lemma 3.6 and (4.1) that (4.2)
becomes


(po — =) 
2C 


k 2 apin 
rend (5) (r) 


Ppl Pn 5592] x Cin ex(- 
(4.3) 




To conclude the proof of Proposition 4.1, it suffices to show that, for all
$1 \le j \le k$,

\[ \binom{4n}{2j}^2 \le \exp\Bigl(\frac{(p_0 - p)^2 n}{4C_1}\Bigr). \tag{4.4} \]

To do this, we will make use of ([7], Corollary 2.6.2)

\[ \binom{n}{k} \le \exp\bigl(nH(k/n)\bigr), \qquad 0 < k < n, \]

where, for all $x \in (0, 1)$,
\[ H(x) := -x\log x - (1 - x)\log(1 - x). \]


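Thus, for all $j = 1, 2, \ldots, k := \lfloor av\delta(p_0 - p)^2 n/\log(1/(p_0 - p)) \rfloor$, we have

\[ \binom{4n}{2j} \le \binom{4n}{2k} \le \exp\Bigl(4nH\Bigl(\frac{k}{2n}\Bigr)\Bigr), \tag{4.5} \]

where the first inequality holds since $k < n/10$.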
There is $x_0 \in (0, 1)$ such that if $x \in (0, x_0)$, then $-(1 - x)\log(1 - x) \le
-\log(1 - x) \le -x\log x$, showing that, for all $x \in (0, x_0)$, we have

\[ H(x) \le 2x\log\frac{1}{x}. \]
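
The two entropy estimates used here can be checked numerically. The following is a minimal sketch, assuming natural logarithms and the range $0 < k < n$ for the binomial bound; the function names are illustrative only and do not come from the paper.

```python
import math


def binary_entropy(x: float) -> float:
    """H(x) = -x log x - (1 - x) log(1 - x) for x in (0, 1), natural log."""
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)


def entropy_bound_holds(n: int, k: int) -> bool:
    """Check binom(n, k) <= exp(n * H(k / n)) for 0 < k < n.

    The comparison is made on the logarithmic scale to avoid overflow;
    a tiny tolerance absorbs floating-point rounding.
    """
    log_binom = math.log(math.comb(n, k))
    return log_binom <= n * binary_entropy(k / n) + 1e-9


def small_x_bound_holds(x: float) -> bool:
    """Check H(x) <= 2 x log(1/x), which is only claimed for small x."""
    return binary_entropy(x) <= 2.0 * x * math.log(1.0 / x)


if __name__ == "__main__":
    print(all(entropy_bound_holds(n, k) for n in range(2, 60) for k in range(1, n)))
    print(small_x_bound_holds(0.1), small_x_bound_holds(0.9))
```

The second check fails for $x$ near 1, which is why the text restricts to $x \in (0, x_0)$.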


By choosing $\delta := \delta(q)$ so small that $av\delta < x_0$, we guarantee that $k/2n < x_0$.
Since $x\log\frac{1}{x}$ is increasing on (0, 1), we obtain


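\[
\begin{aligned}
H\Bigl(\frac{k}{2n}\Bigr)
&\le \frac{av\delta(p_0 - p)^2}{\log(1/(p_0 - p))}
 \log\Bigl(\frac{2n}{\lfloor av\delta(p_0 - p)^2 n/\log(1/(p_0 - p)) \rfloor}\Bigr) \\
&\le \frac{av\delta(p_0 - p)^2}{\log(1/(p_0 - p))}
 \log\Bigl(\frac{4\log(1/(p_0 - p))}{av\delta(p_0 - p)^2}\Bigr),
\end{aligned}
\]

since $\lfloor x \rfloor \ge x/2$ for $x \ge 1$. Simple algebra shows that the above equals

\[
\begin{aligned}
&\frac{av\delta(p_0 - p)^2}{\log(1/(p_0 - p))}
 \Bigl[\log\log\Bigl(\frac{1}{p_0 - p}\Bigr) + \log\Bigl(\frac{4}{av\delta}\Bigr) + 2\log\Bigl(\frac{1}{p_0 - p}\Bigr)\Bigr] \\
&\qquad \le 3av\delta(p_0 - p)^2 + av\delta(p_0 - p)^2 \log\Bigl(\frac{4}{av\delta}\Bigr),
\end{aligned}
\]

using $-\infty < \log\log t < \log t$ for $t > 1$ and $\log(1/(p_0 - p)) > 1$. Choosing $\delta := \delta(q) \in
(0, 1/2)$ so small that $av\delta\log(4/(av\delta)) \le (av\delta)^{1/2}$, we get

\[ H\Bigl(\frac{k}{2n}\Bigr) \le 4(av\delta)^{1/2}(p_0 - p)^2. \tag{4.6} \]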




Substituting (4.6) into (4.5) and squaring, we obtain, for all $1 \le j \le k$,

\[ \binom{4n}{2j}^2 \le \exp\bigl(32(av\delta)^{1/2}(p_0 - p)^2 n\bigr). \]

Recalling that $C_1$ depends only on $q$, we may choose $\delta := \delta(q) > 0$ even smaller if
necessary to ensure that $32(av\delta)^{1/2} < 1/4C_1$, thus showing (4.4). Proposition 4.1
follows. □
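
The constant 32 in the last display is plain arithmetic. As a worked step, assuming that (4.5) has the form $\binom{4n}{2j} \le \exp(4nH(k/2n))$ and that (4.6) has the form $H(k/2n) \le 4(av\delta)^{1/2}(p_0 - p)^2$:

```latex
\[
\binom{4n}{2j}^{2}
  \le \exp\Bigl(2 \cdot 4n \cdot H\Bigl(\frac{k}{2n}\Bigr)\Bigr)
  \le \exp\Bigl(8n \cdot 4(av\delta)^{1/2}(p_0 - p)^2\Bigr)
  = \exp\Bigl(32(av\delta)^{1/2}(p_0 - p)^2 n\Bigr).
\]
```

With $32(av\delta)^{1/2} < 1/4C_1$, the right-hand side is at most $\exp((p_0 - p)^2 n/4C_1)$, which is the bound (4.4).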


Acknowledgments. The authors gratefully acknowledge detailed referee 
comments, which resulted in an improved exposition. They also acknowledge the 
hospitality of Lehigh University and the University of Colorado, where parts of 
this research were done. 


REFERENCES

[1] CAMPANINO, M., CHAYES, J. and CHAYES, L. (1991). Gaussian fluctuations of connectivities
in subcritical regime of percolation. Probab. Theory Related Fields 88 269-341.
MR1100895
[2] CHAYES, J., CHAYES, L. and DURRETT, R. (1986). Critical behavior of the two-dimensional
first passage time. J. Statist. Phys. 45 933-948. MR0881316
[3] COX, T. and DURRETT, R. (1981). Some limit theorems for percolation processes with
necessary and sufficient conditions. Ann. Probab. 9 583-603. MR0624685
[4] COX, T. and KESTEN, H. (1981). On the continuity of the time constant of first-passage
percolation. J. Appl. Probab. 18 809-819. MR0633228


[5] DURRETT, R. (1984). Oriented percolation in two dimensions. Ann. Probab. 12 999-1040. 
MR0757768 
[6] DURRETT, R. and LIGGETT, T. (1981). The shape of the limit set in Richardson's growth 
model. Ann. Probab. 9 186-193. MR0606981 
[7] ENGEL, K. (1997). Sperner Theory. Cambridge Univ. Press. MR1429390
[8] HAMMERSLEY, J. M. and WELSH, D. J. A. (1965). First-passage percolation, subadditive 
processes, stochastic networks, and generalized renewal theory. In Bernoulli, Bayes, 
Laplace Anniversary Volume (J. Neyman and L. LeCam, eds.) 61—110. Springer, Berlin. 
MR0198576 
[9] KESTEN, H. (1986). Aspects of first passage percolation. École d'été de Probabilités de Saint- 
Flour XIV. Lecture Notes in Math. 1180 125—264. Springer, Berlin. MR0876084 
[10] KUCZEK, T. (1989). The central limit theorem for the right edge of supercritical oriented per- 
colation. Ann. Probab. 17 1322-1332. MR1048929 
[11] KUCZEK, T. and CRANK, K. (1991). A large deviation result for regenerative processes. J. The- 
oret. Probab. 4 551—560. MR1115162 
[12] MARCHAND, R. (2002). Strict inequalities for the time constant in first passage percolation. 
Ann. Appl. Probab. 12 1001-1038. MR1925450 
[13] ZHANG, Y. (2004). On the infinite differentiability of the right edge in the supercritical oriented 
percolation. Stochastic Process. Appl. 114 279-286. MR2101245 
DEPARTMENT OF MATHEMATICS DEPARTMENT OF MATHEMATICS 
LEHIGH UNIVERSITY UNIVERSITY OF COLORADO 
BETHLEHEM, PENNSYLVANIA 18015 COLORADO SPRINGS, COLORADO 80933 
USA USA 
E-MAIL: joseph.yukich@lehigh.edu E-MAIL: yzhang@math.uccs.edu
URL: www.lehigh.edu/~jey0/jey0.html 


The Annals of Probability 

2006, Vol. 34, No. 2, 593-637 

DOI: 10.1214/009117905000000693

© Institute of Mathematical Statistics, 2006


GREEDY LATTICE ANIMALS: GEOMETRY AND CRITICALITY¹


BY ALAN HAMMOND 
University of California—Berkeley 


Assign to each site of the integer lattice $\mathbb{Z}^d$ a real score, sampled according
to the same distribution $F$, independently of the choices made at all other
sites. A lattice animal is a finite connected set of sites, with its weight being
the sum of the scores at its sites. Let $N_n$ be the maximal weight of those lattice
animals of size $n$ that contain the origin. Denote by $N$ the almost sure
finite constant limit of $n^{-1}N_n$, which exists under a mild condition on the
positive tail of $F$. We study certain geometrical aspects of the lattice animal
with maximal weight among those contained in an $n$-box where $n$ is large,
both in the supercritical phase where $N > 0$, and in the critical case where
$N = 0$.


1. Introduction. In this paper a lattice animal is a connected set $\xi$ of sites
in $\mathbb{Z}^d$, where $d \ge 2$. To each $\xi$ in the set $\mathcal{A}$ of all lattice animals assign the random
weight $S(\xi) = \sum_{v \in \xi} X_v$, where $\{X_v : v \in \mathbb{Z}^d\}$ are independent random variables,
each having a common distribution $F$. With $|\xi|$ denoting the number of sites in
a set $\xi \subset \mathbb{Z}^d$, a greedy lattice animal of size $n$ is a set $\xi \in \mathcal{A}$ that contains the
origin 0 with $|\xi| = n$ and whose weight $N_n$ is maximal among all such sets. The
study of greedy lattice animals was begun by Cox, Gandolfi, Griffin and Kesten [2]
and Gandolfi and Kesten [5]. Cox, Gandolfi, Griffin and Kesten [2] present some
optimization problems that motivate the definition of greedy lattice animals.
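
For small $n$ the quantity $N_n$ can be computed by exhaustive search. The sketch below is an illustration only, assuming $d = 2$ and a user-supplied deterministic score function standing in for a sample of the $X_v$; the names `animals_containing_origin` and `greedy_weight` are hypothetical.

```python
from typing import Callable, FrozenSet, Set, Tuple

Site = Tuple[int, int]


def neighbors(v: Site):
    """Nearest neighbors of v in Z^2."""
    x, y = v
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]


def animals_containing_origin(n: int) -> Set[FrozenSet[Site]]:
    """All connected sets of n sites in Z^2 that contain the origin.

    Animals are grown one site at a time; every animal of size n is reached
    because some non-origin site can always be removed while keeping the
    set connected.
    """
    current: Set[FrozenSet[Site]] = {frozenset([(0, 0)])}
    for _ in range(n - 1):
        grown: Set[FrozenSet[Site]] = set()
        for animal in current:
            for v in animal:
                for u in neighbors(v):
                    if u not in animal:
                        grown.add(animal | {u})
        current = grown
    return current


def greedy_weight(n: int, score: Callable[[Site], float]) -> float:
    """N_n: the maximal weight S(xi) over animals xi of size n containing 0."""
    return max(sum(score(v) for v in animal)
               for animal in animals_containing_origin(n))


if __name__ == "__main__":
    # Scores equal to 1 on the horizontal axis and -1 elsewhere.
    print(greedy_weight(3, lambda v: 1.0 if v[1] == 0 else -1.0))
```

The number of animals grows exponentially in $n$, so this is only practical for very small sizes.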

It is shown in Theorem 1 in [5] that $n^{-1}N_n$ converges almost surely and in $L^1$ to
a nonrandom finite constant $N$, in the case that the quantities $X_v$ are nonnegative
and

\[ \int_1^\infty x^d (\log x)^c \, dF(x) < \infty \quad \text{for some } c > d. \tag{1.1} \]

The same conclusion is derived in Theorem 1.1 of [19] for nonnegative $X_v$ under
the slightly weaker condition that

\[ \int_0^\infty (1 - F(x))^{1/d} \, dx < \infty \tag{1.2} \]


[which in particular holds whenever $c > d - 1$ in (1.1)]. By a subtle and involved
argument, Theorem 2.1 in [3] extends the almost sure convergence $n^{-1}N_n \to N$ to
any real-valued $X_v$ whose distribution satisfies (1.1). The condition (1.2) is almost
optimal, as $\limsup_n n^{-1}N_n = \infty$ whenever $\int_0^\infty x^d \, dF(x) = \infty$ (see Theorem 2.2
of [3]). In this paper we continue the study, initiated in [3], of greedy lattice
animals whose law $F$ may have an arbitrary negative tail. We have used (1.2) as the
condition that we require of the positive tail. We make use of some results of [3]
in this paper, and note that the proofs on which we rely are valid with the
condition (1.2) replacing (1.1), as is explained in the note on page 207 of [3].


Received November 2004; revised March 2005.
¹Supported in part by NSF Grant DMS-00-71448.
AMS 2000 subject classification. 60K35.
Key words and phrases. Percolation, lattice animals, optimization.

We note in passing that a related object, the greedy lattice path of size $n$, in
which the space $\mathcal{A}$ is replaced by the collection of finite self-avoiding paths, is
studied in [5] and [19]. These papers prove that the corresponding normalized
weights $n^{-1}M_n$ converge to a nonrandom finite constant $M \le N$ (subject to the
same nonnegativity and tail conditions on $X_v$). It is further shown in [14] that
$M = N$ only when $X_0$ is of bounded support and the probability that $X_0$ equals the
right endpoint of its support is at least the critical probability $p_c$ for site percolation
on $\mathbb{Z}^d$.

One of the central themes in the study of greedy lattice animals is the phase
transition that the model undergoes as the constant $N$ changes from being
negative to positive. It is true that independent site percolation, obtained by taking
$X_v \in \{-\infty, 1\}$, is excluded from the theory. (Indeed, in the supercritical phase, i.e.,
when $P(X_0 = 1) > p_c$, the limit $N$ of $n^{-1}N_n$ is a nondegenerate random variable
on $\{-\infty, 1\}$, with $\{N = 1\}$ being the event that 0 is in the infinite open percolation
cluster.) It is helpful, however, to think of the natural objects of study in the
theory of greedy lattice animals as counterparts of well-known objects in percolation.
We will mention some of these parallels, as well as comparing and contrasting
percolation with the current framework, throughout this Introduction.

Let $\mathcal{A}_C$ denote the collection of lattice animals contained in a set $C \subset \mathbb{Z}^d$. For
positive integers $n$, let $B_{v,n} = v + \{0, \ldots, n - 1\}^d$ denote the $n$-box shifted by
the vector $v \in \mathbb{Z}^d$, with $B_n = B_{0,n}$. Dembo, Gandolfi and Kesten [3] study the
limiting growth of the weight of the greedy lattice animal in the $n$-box,

\[ G_n = \max\{S(\xi) : \xi \in \mathcal{A}_{B_n}\}, \]
and its size
\[ L_n := \min\{|\xi| : \xi \in \mathcal{A}_{B_n} \text{ and } S(\xi) = G_n\}, \]

the minimum being taken to break ties in the case where $F$ is not atomless.

In the percolation model ($X_v \in \{-\infty, 1\}$), there is a transition for the quantity
$G_n = L_n$ from $O(\log n)$ at $p := P(X_v = 1) < p_c$ to $O(n^d)$ at $p > p_c$. This
transition is analogous to the emergence of a giant component in the random graph
model $G(n, p)$ for $p = c/n$ at $c = 1$. It is shown in Theorems 3.1 and 3.2 of [3]
that a similar transition occurs for any proper distribution $F$ satisfying (1.2). That
is, if $N < 0$, then $n^{-1}G_n$ is almost surely bounded in $n$, whereas, in the case that
$N > 0$, there exists a constant $c \in (0, 1)$ such that, almost surely, $n^{-d}G_n \in (c, c^{-1})$
for all $n$ large enough.
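
In the percolation case the quantity $G_n = L_n$ is simply the size of the largest open cluster in the box, which is easy to compute. The following sketch assumes $d = 2$ and nearest-neighbor connectivity; `sample_open_sites` and `largest_open_cluster` are illustrative names, and the seed and parameters are arbitrary.

```python
import random
from collections import deque
from typing import Set, Tuple

Site = Tuple[int, int]


def sample_open_sites(n: int, p: float, seed: int = 0) -> Set[Site]:
    """Independent site percolation in the box {0,...,n-1}^2 with density p."""
    rng = random.Random(seed)
    return {(x, y) for x in range(n) for y in range(n) if rng.random() < p}


def largest_open_cluster(open_sites: Set[Site]) -> int:
    """Size of the largest nearest-neighbor connected set of open sites."""
    seen: Set[Site] = set()
    best = 0
    for start in open_sites:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:  # breadth-first search of one cluster
            x, y = queue.popleft()
            size += 1
            for u in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                if u in open_sites and u not in seen:
                    seen.add(u)
                    queue.append(u)
        best = max(best, size)
    return best


if __name__ == "__main__":
    for p in (0.3, 0.5, 0.7, 0.9):
        sizes = [largest_open_cluster(sample_open_sites(n, p)) for n in (20, 40, 80)]
        print(p, sizes)
```

Running the loop at densities on either side of the critical value gives a rough picture of the change from small clusters to a cluster occupying a positive fraction of the box.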




Our main goals here are to understand more fully the transition of the weight
and size of the greedy lattice animal in the $n$-box, its shape and the behavior at
criticality, that is, when $N = 0$.

Our first result sheds some light on the geometry of the greedy lattice animal in
a large $n$-box in the supercritical case. We define an $\ell$-box percolation process of
parameter $p \in [0, 1]$ to be the random collection of disjoint $\ell$-boxes $\{B_{\ell a, \ell} : a \in P\}$,
where $P$ is the collection of open sites for an independent site percolation in $\mathbb{Z}^d$
where each site is open with probability $p$.


THEOREM 1.1. Let $F$ be a distribution satisfying (1.2) for which $N > 0$. For
any $\varepsilon > 0$ there exist $C, \ell \in \mathbb{N}$ and an $\ell$-box percolation $\{B_{\ell a, \ell} : a \in P\}$ of
parameter at least $1 - \varepsilon$ such that for all $n$ sufficiently large, each greedy lattice animal
in the $n$-box $B_n$ intersects every $\ell$-box from the largest connected component of
$\{B_{\ell a, \ell} : a \in P,\ B_{\ell a, \ell} \subset \{C\ell, \ldots, n - 1 - C\ell\}^d\}$.


Theorem 1.1 implies that $L_n$ exceeds the number of $\ell$-boxes in the largest
connected component to which the theorem refers. Applying some well-known facts
about the supercritical phase of percolation, that will later be stated in Lemma 2.7,
the limiting fraction of the $n$-box occupied by the corresponding greedy lattice
animal,

\[ L = \liminf_{n \to \infty} n^{-d} L_n, \tag{1.3} \]

is therefore bounded away from zero when $N > 0$. In this way, the theorem
removes the restriction of exponentially decaying positive tail of $X_0$ under which
this is proved in Theorem 4.4 of [3]. Theorem 1.1 shows how pervasive any greedy
lattice animal in a large box must be: it reaches into all but a small fraction of
the array of $\ell$-boxes. The limiting density $\lim_n n^{-d}|\gamma_n|$ of the largest cluster $\gamma_n$
in $B_n$ of a supercritical percolation is the density of the unique infinite cluster,
$\theta(p) = P(|C(0)| = \infty)$, by Lemma 7.89 of [6]. For this reason, the counterpart in
the framework of greedy lattice animals of the density of the infinite cluster is the
limiting fraction $L$. At least in principle, this quantity may be random. Our next
result advances the treatment of Theorem 3.2 in [3] by resolving the corresponding
question for the quantity $G$.


THEOREM 1.2. For any distribution $F$ that satisfies condition (1.2), there
exists a nonrandom finite constant $G$ such that almost surely

\[ G = \lim_{n \to \infty} n^{-d} G_n. \]


The proof of Theorem 1.2 that we present addresses only the case when $N > 0$,
because, as we shall discuss in comments after the statements of Theorem 1.3
and Proposition 1.5, results from [3] and other arguments settle the case in which
$N \le 0$ (the constant $G$ then being equal to zero).




In common with [3], the assumption that F may have an arbitrary negative tail 
has created the need for more intricate techniques in the proof of Theorem 1.1, and 
still more so in that of Theorem 1.2, than those that would work were these results 
to suppose conditions on the negative tail. We next prove a relation between L and 
the constants G and N. 


THEOREM 1.3. If $F$ is a distribution satisfying condition (1.2) for which
$N > 0$, then the inequality $G \le LN$ holds almost surely.


Given Theorem 1.1, the proof of Theorem 1.3 is simple. To outline the
argument, whose details appear in Section 4, let $\xi_n$ be a greedy lattice animal in the
box $B_n$ for which $|\xi_n| = L_n$. If $\xi_n$ happens to contain 0, then

\[ N_{L_n} \ge S(\xi_n) = G_n. \tag{1.4} \]

The quantity $N_{L_n}$ behaves like $N L_n$ for high values of $n$ by Theorem 2.1 of [3],
from which the inequality $G \le LN$ follows. In general, of course, the origin may
not lie in $\xi_n$. However, it follows from Theorem 1.1 that $\xi_n$ reaches to within a
distance of 0 that is bounded above, uniformly in $n$. This means that (1.4) holds up
to a constant term, implying Theorem 1.3.

Consider a distribution $F$ for which $N = 0$. We may apply Theorem 1.3 in the
supercritical case of the law $F_\varepsilon$, defined as the distribution of the random variable
$X_0 + \varepsilon$, where $X_0$ has the law $F$, and where $\varepsilon > 0$ is arbitrarily small. It readily
follows that $G = 0$ when $N = 0$. The authors of [3] comment that they do not address
the critical case. The next theorem does so, providing a more precise estimate on
the growth of $G_n$ at criticality than the statement that $G = 0$ in this case.


THEOREM 1.4. Suppose $F$ satisfies (1.2) and is such that $N = 0$. Then, for
any $c > d/(d - 1)$, we have that almost surely

\[ \lim_{n \to \infty} (\log n)^{-c} n^{-1} G_n = 0. \tag{1.5} \]

The result stands in contrast to that valid for critical percolation in $\mathbb{Z}^d$, for which
Kesten and Zhang [10] prove that the size of the largest open cluster in the box $B_n$
grows at a rate exceeding $n^{1+\beta}$, for some $\beta > 0$. Theorem 1.4 is an optimal
result up to logarithmic corrections, because for choices of $F$ that come close to
violating the positive tail condition (1.2), the maximum weight of sites in $B_n$ is
typically of power order $n$. Indeed, for any $\alpha > 0$, $F(x) = 1 - x^{-d}(\log x)^{-(d+\alpha)}$
satisfies (1.2), with


Ne? OO 


(1.6) lim P(max Xy> nqog)! =) — 1, 
V€ B, 


where $X_0$ has law $F$. For each $\lambda \in \mathbb{R}$, the random variables given by $Y_v(\lambda) =
X_v 1\{X_v > \lambda\} - \lambda$ satisfy (1.6) [or, more precisely, they satisfy this condition if
the factor of $n$ in (1.6) is replaced by $n/2$]. Provided that $\lambda$ is high enough, the
corresponding value of $N$ may be zero or even negative. Thus, the bound in
Theorem 1.4 cannot be improved by more than a logarithmic correction. Nonetheless,
it may be that for some less contrived choices of $F$ for which $N = 0$, the growth
rate of $G_n$ is sublinear.

Finally, we mention some results that arise from considering "greedy" lattice 
animals that are constrained to occupy a given fraction of the sites of a large box. 

For $n \in \mathbb{N}$ and $\alpha \in (0, 1)$, let

\[ G_n(\alpha) = \max\{S(\xi) : \xi \in \mathcal{A}_{B_n},\ |\xi| = \lfloor n^d \alpha \rfloor\} \]

denote the maximal weight among lattice animals of specified size that are also
contained inside the $n$-box. The constant $N$ can be obtained as the limit of the
weight per site of low-density maximal-weight animals in a large box:


PROPOSITION 1.5. Suppose that $F$ satisfies (1.2). Then
\[ G(\alpha) = \lim_n n^{-d} G_n(\alpha) \]

exists almost surely for each $\alpha \in (0, 1)$, with $G : (0, 1) \to \mathbb{R}$ a concave, nonrandom
function. If further $X_0$ is bounded below, then $\alpha^{-1} G(\alpha) \to N$ as $\alpha \downarrow 0$.


Amir Dembo proposed the method of proof of Proposition 1.5 as a means to
derive the inequality $G \le LN$. Given that we obtain this inequality by a different
proof, we have omitted the proof of Proposition 1.5 from this paper. The proof
is presented in the Appendix of [11]. The consequence that $G \le LN$ holds in
the case where $X_0$ is bounded below is also proved in Corollary 1 of [11]. This
alternative approach to proving this inequality has the minor merit that it works
even in the case that $N < 0$, because Proposition 1.5 requires no hypothesis on
the value of $N$. We know from [3] that $G = 0$ when $N < 0$ and $G > 0$ when
$N > 0$, so that Corollary 1 of [11] implies that $L = 0$ when $N < 0$ and $X_0$ is
bounded below. (In the case where $X_0$ has exponentially decaying positive tail and
$N < 0$, Theorems 4.3 and 4.4 of [3] prove that $G_n = O(\log n)$ and $L_n = O(\log n)$,
similarly to the case of percolation.)

If the distribution $F$ is chosen so that $N = 0$, with $X_0$ bounded below and
the function $G : (0, 1) \to \mathbb{R}$ strictly concave, then Proposition 1.5 implies that
the right-derivative at 0 of the concave function $G : (0, 1) \to \mathbb{R}$ vanishes, so that
$G(0) = \sup_{\alpha \in (0,1)} G(\alpha)$. It is plausible from the definition of $G$, and is verified
during the course of the proof of Proposition 1.5 of [11], that the constant $G$ is equal to
$\sup_{\alpha \in (0,1)} G(\alpha)$. For such a law $F$, then, we find that $G(0) = G = \lim_n n^{-d} G_n =
\lim_n n^{-d} G_n(n^{-d} L_n)$, from which we learn that $L = 0$. Thus, we might say
that, in many cases of a critical choice of law $F$, the greedy lattice animal in a
large box comprises a negligible fraction of the sites in that box. The corresponding
statement for percolation is that $\theta(p_c) = 0$, which amounts to the absence of
any infinite cluster at the critical value. One analogue of continuity of the
percolation probability is trivially false: if $X_0$ is equal to $\varepsilon$ almost surely, then $L$ is the
almost sure constant 1, whatever the value of $\varepsilon > 0$. However, the map $F \mapsto G[F]$
does not have a jump discontinuity as the law $F$ is increased through choices for
which $N = 0$. This follows from Lee [15], which proves, under uniform stochastic
dominance and moment assumptions, that $F \mapsto N[F]$ is continuous with respect
to weak convergence of measures, and the upper bound $G \le LN \le N$ asserted by
Theorem 1.3.


The global geometry of the greedy lattice animal: two examples. Theorem 1.1
prompts the question: if $N > 0$, how closely does the geometry of a greedy lattice
animal in a large box resemble that of the infinite component of a supercritical
site percolation? Two examples illustrate how the answer depends on the choice
of the distribution $F$. We do not rely on the assertions made about these examples
later in the paper, and do not provide proofs of them, although these proofs are
straightforward.

In the first example, $X_v \in \{-\lambda, 1\}$ with $P(X_0 = 1) = 1 - P(X_0 = -\lambda) = p$
for a pair $(p, \lambda) \in (0, 1) \times [0, \infty)$ for which $N > 0$. If the parameter $p > p_c$ is
supercritical for site percolation, then a greedy lattice animal in $B_n$ contains, for $n$
large enough, the largest connected component $\Gamma$ of $\{v \in B_n : X_v = 1\}$. Moreover,
a smaller cluster $\gamma$ of one-valued sites in $B_n$ would lie in the greedy animal by
forming a path into $\Gamma$ unless it is isolated from $\Gamma$ by a region of $-\lambda$-valued sites
which requires a path of length about $|\gamma|/\lambda$ sites to cross. Connected sets of
one-valued sites are therefore much more prone to be part of the greedy lattice animal
in a large box than connected sets of open sites are liable to form part of the largest
connected component of open sites in the same box for a supercritical percolation.
This means that, in this case, the global geometry of a greedy lattice animal is at
least as connected as that of the largest component of a percolation.

This choice of law for $X_0$ has in fact been studied previously. The behavior of
the greedy lattice path in this model is closely related [by setting $\lambda = \rho/(1 + \rho)$]
to the model of $\rho$-percolation, introduced in [21]. Taking $\lambda > 0$ fixed, Lee [16]
explores the behavior of $N[p]$ as $p \downarrow 0$ and that of $1 - N[p]$ as $p \uparrow p_c$.

As a contrast, consider the case where $X_v \in \{-1, \lambda\}$ with $P(X_0 = -1) = 1 -
P(X_0 = \lambda) = p$, with $\lambda$ high and $p$ close to 1. Supposing that for a greedy lattice
animal $\gamma$ in a large box $B_n$, the collection $\gamma \cap \{v \in B_n : X_v = \lambda\}$ is some given set
$\phi \subset B_n$, then $\gamma$ will minimize the size of the set of connection costs $\{v \in B_n : X_v =
-1\}$ subject to joining together the sites of $\phi$ by paths in $B_n$. The choice for the
set $\phi$ is made by then optimizing over subsets of $\{v \in B_n : X_v = \lambda\}$. Thinking of
$\lambda$-valued sites as cities that benefit from travel between them but must pay some
given cost per unit distance of connecting road (or $-1$-valued site), the greedy
lattice animal is the network of cities that renders the greatest benefit over road
cost. As such, the greedy animal in this case is closely related to the tree with
minimal total edge length subject to the constraint that its vertices contain some
fixed fraction of the points of a constant-rate Poisson process in a large box in $\mathbb{R}^d$,
with the edges being line segments between such points. The global geometry of
this object would seem to be starkly different from that of the largest connected
component of a supercritical percolation in a large box, having a much higher
graphical distance between a typical pair of distant points.

We mention that, in the current case, we may interpret Theorems 1.4 and 1.1
as asserting that if the benefit $\lambda$ per city is high enough that there exists a network
of cities in $B_n$ whose collective benefit exceeds road cost by an amount that is
superlinear in $n$, then the optimal network in fact comprises a positive proportion
of all the cities in $B_n$.

We remark also that the greedy lattice path for this choice of law F corresponds 
to the variant of the traveling salesman problem for the Poisson process of points, 
where the salesman need only visit a high but fixed fraction of the points in a large 
box. 


Organization. In Section 2 we first define notation and prove some lemmas
that will be of use throughout the paper, and then give the proof of Theorem 1.1.
The key to this proof is Lemma 2.4, which shows that it is probable that large
boxes, of sidelength $\ell$, contain weighty lattice animals that may readily be joined
to moderately sized animals in their surroundings. Any greedy lattice animal in a
much larger box $\Gamma$ fails to avoid most such animals in the array of $\ell$-boxes that lie
in $\Gamma$, because each of these animals is liable to increase the weight of any animal
that runs nearby by joining up with it.

In Section 3 we prove Theorem 1.2, beginning with an outline of the argument. 
The proof of Theorem 1.3 is given in Section 4. 

In Section 5 we prove Theorem 1.4 by showing that the negation of (1.5) implies 
that N > 0: animals that witness the violation of the bound (1.5) at finite values of 
n are concatenated by reasonably short paths the weight of whose sites is uniformly 
bounded below. All sufficiently large boxes are therefore highly likely to contain 
animals whose weight is a high multiple of the sidelength of the box. A further 
argument which involves concatenating these animals establishes the conclusion 
that N > 0. 


2. The geometry of the maximal weight animal: proof of Theorem 1.1.
We shall examine in this section some aspects of the geometry of the greedy
lattice animals in the $n$-box $B_n$ for large $n$, in the case where $N > 0$. We show
below that such animals intersect well the largest connected component in the $n$-box
for supercritical $\ell$-box percolation, provided that $\ell$ is some fixed large value and
$n > n(\ell)$ is high enough. The proof of Theorem 1.1, in common with several later
proofs, will require numerous definitions and lemmas, which we now state and
prove.




DEFINITION 2.1. Let $G_B$ and $L_B$ denote the weight and size of a greedy
lattice animal of minimal size in a given $m$-box $B = B_{v,m}$. For $\lambda \in \mathbb{R}$, say that a site
$v \in \mathbb{Z}^d$ is $\lambda$-white (or white, for short) if $X_v > -\lambda$, and is black otherwise. The set
of $\lambda$-white sites is a percolation: we will define numerous site percolations on $\mathbb{Z}^d$,
so that each percolation process is an independent site percolation, unless stated
otherwise, and so that $p_c = p_c(d)$ denotes the critical value for site percolation
in $\mathbb{Z}^d$. The minimal length among all $\lambda$-white paths in $\mathbb{Z}^d$ from some white site
in $B$ to some white site in $A$ is denoted by $D(B, A)$ [or by $D(v, A)$ or $D(v, u)$, in
the case that $B = \{v\}$ and possibly $A = \{u\}$]. We write $B \leftrightarrow A$ (in $C$) in case such a
path (in $C$) exists. We also write $\ell_\infty(v, A) = \inf_{u \in A} \|v - u\|$ for the minimal
sup-norm distance from $v$ to a site in $A$. Further, for an $m$-box $B = B_{v,m}$ and for any
$q \in \mathbb{N}$, set

\[ B[q] = \bigcup_{0 \le \|a\| \le q} B_{v + ma,\, m}, \]

noting that $B \subset B[q]$.
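
The objects in Definition 2.1 are straightforward to compute on a finite configuration. The sketch below is illustrative only: it assumes $d = 2$, measures path length in steps, and treats any site missing from the `scores` dictionary as black; the helper names `enlarged_box` and `white_distance` are hypothetical.

```python
from collections import deque
from itertools import product
from typing import Dict, Iterable, Optional, Set, Tuple

Site = Tuple[int, int]


def enlarged_box(v: Site, m: int, q: int) -> Set[Site]:
    """B[q]: the union of the m-boxes B_{v+ma,m} over all a with sup-norm <= q."""
    sites: Set[Site] = set()
    for ax, ay in product(range(-q, q + 1), repeat=2):
        for dx, dy in product(range(m), repeat=2):
            sites.add((v[0] + m * ax + dx, v[1] + m * ay + dy))
    return sites


def white_distance(scores: Dict[Site, float], lam: float,
                   B: Iterable[Site], A: Iterable[Site]) -> Optional[int]:
    """D(B, A): fewest steps of a lambda-white path from B to A, or None.

    A site v is lambda-white when scores[v] > -lam; sites absent from
    `scores` are treated as black (an assumption of this sketch).
    """
    white = {v for v, x in scores.items() if x > -lam}
    targets = {v for v in A if v in white}
    dist = {v: 0 for v in B if v in white}
    queue = deque(dist)
    while queue:  # multi-source breadth-first search over white sites
        v = queue.popleft()
        if v in targets:
            return dist[v]
        x, y = v
        for u in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if u in white and u not in dist:
                dist[u] = dist[v] + 1
                queue.append(u)
    return None


if __name__ == "__main__":
    scores = {(x, y): 0.0 for x in range(3) for y in range(3)}
    scores[(1, 0)] = scores[(1, 1)] = -5.0  # a black barrier with a gap at (1, 2)
    print(white_distance(scores, 1.0, [(0, 0)], [(2, 0)]))
```

Returning `None` corresponds to the case in which $B \leftrightarrow A$ fails.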


DEFINITION 2.2. The boundary of $A \subset \mathbb{Z}^d$, denoted $\partial A$, is the collection of
sites $u \notin A$ adjacent in $\mathbb{Z}^d$ to some $v \in A$. For $B \subset \mathbb{Z}^d$, the $B$-boundary $\partial_B A$ of
$A \subset B$ is given by $B \cap \partial A$. The white cluster $\mathcal{W}(v)$ of $v \in \mathbb{Z}^d$ is the collection
of sites $u$ such that $u \leftrightarrow v$ [in particular $\mathcal{W}(v)$ is empty in case $v$ is black]. In an
analogous manner, we define the black cluster of $v$, denoted by $\mathcal{B}(v)$.

Let $\mathcal{L}$ be the graph with vertex set $\mathbb{Z}^d$ and with an edge between any pair
$u \ne v \in \mathbb{Z}^d$ with $\|u - v\| = \max_{1 \le i \le d} |u(i) - v(i)| = 1$. We use the notation $\mathcal{W}_{\mathcal{L}}(v)$
and $\mathcal{B}_{\mathcal{L}}(v)$ to denote the white and black clusters of the site $v$ with respect to the
graph $\mathcal{L}$. For $v \in B_n$, we write $\mathcal{W}_n(v)$ or $\mathcal{B}_n(v)$ for the white or black clusters
of $v$ in the graph $B_n$, and $\mathcal{W}_{n,\mathcal{L}}(v)$ or $\mathcal{B}_{n,\mathcal{L}}(v)$ for these clusters in the induced
subgraph of $\mathcal{L}$ with vertex set $B_n$. By "path" or "$\mathcal{L}$-path," we mean a finite
self-avoiding path in the nearest-neighbor or $\mathcal{L}$-topology. We will occasionally write
"$\mathbb{Z}^d$-path" to emphasize that the nearest-neighbor topology is being used.

The set $C$ separates $A, B \subset \mathbb{Z}^d$ (in $D$) if any path (in $D$) from $A$ to $B$
intersects $C$. If each such path intersects $C$ at a location not lying in $A \cup B$, then we
say that $C$ properly separates $A$ and $B$. The set $C$ separates $A$ from infinity if any
infinite path from $A$ intersects $C$.

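For $C \subset \mathbb{Z}^d$, the exterior boundary of $C$ is given by

\[
\partial_{\mathrm{ext}}(C) = \{v \in \mathbb{Z}^d : v \text{ is adjacent in } \mathcal{L} \text{ to some } w \in C,\
\exists \text{ a path from } \infty \text{ to } v \text{ disjoint from } C\}.
\tag{2.1}
\]

For $C \subset B_n$ and $x \in B_n \setminus C$, the boundary of $C$ visible from $x$ in $B_n$ is given by

\[
\partial_{\mathrm{vis}(x),n}(C) = \{v \in B_n : v \text{ is adjacent in } \mathcal{L} \text{ to some } w \in C,\
\exists \text{ a path in } B_n \text{ from } x \text{ to } v \text{ disjoint from } C\}.
\tag{2.2}
\]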
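
For a finite set $C$, definition (2.1) can be evaluated by a flood fill from outside a bounding box. The sketch below assumes $d = 2$ and a nonempty finite $C$; `exterior_boundary` and `is_nearest_neighbor_connected` are illustrative names. It also lets one check, on small examples, the nearest-neighbor connectedness of the exterior boundary of an $\mathcal{L}$-connected set.

```python
from collections import deque
from itertools import product
from typing import Iterable, Set, Tuple

Site = Tuple[int, int]
NN_STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))
STAR_STEPS = [s for s in product((-1, 0, 1), repeat=2) if s != (0, 0)]


def exterior_boundary(C: Iterable[Site]) -> Set[Site]:
    """Sites star-adjacent to C that nearest-neighbor paths from far away reach."""
    C = set(C)
    xs = [x for x, _ in C]
    ys = [y for _, y in C]
    x0, x1, y0, y1 = min(xs) - 2, max(xs) + 2, min(ys) - 2, max(ys) + 2
    start = (x0, y0)  # a corner of the padded box, certainly outside C
    reached = {start}
    queue = deque([start])
    while queue:  # flood fill of the complement of C inside the padded box
        x, y = queue.popleft()
        for dx, dy in NN_STEPS:
            u = (x + dx, y + dy)
            if x0 <= u[0] <= x1 and y0 <= u[1] <= y1 and u not in C and u not in reached:
                reached.add(u)
                queue.append(u)
    return {v for v in reached
            if any((v[0] + dx, v[1] + dy) in C for dx, dy in STAR_STEPS)}


def is_nearest_neighbor_connected(sites: Iterable[Site]) -> bool:
    """True if the set is connected using nearest-neighbor steps only."""
    sites = set(sites)
    if not sites:
        return True
    start = next(iter(sites))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for dx, dy in NN_STEPS:
            u = (x + dx, y + dy)
            if u in sites and u not in seen:
                seen.add(u)
                queue.append(u)
    return seen == sites


if __name__ == "__main__":
    diagonal = {(0, 0), (1, 1)}  # star-connected but not nearest-neighbor connected
    boundary = exterior_boundary(diagonal)
    print(len(boundary), is_nearest_neighbor_connected(boundary))
```

The padding of two sites ensures that every star-neighbor of $C$ lies inside the box and that the outer ring of the box is free of $C$, so reachability inside the box agrees with reachability from infinity.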




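LEMMA 2.1. If $C \subset \mathbb{Z}^d$ is finite and $\mathcal{L}$-connected, then $\partial_{\mathrm{ext}}(C)$ is
$\mathbb{Z}^d$-connected. If $C \subset B_n$ is $\mathcal{L}$-connected, with $x \in B_n$ and $x \notin C$, then $\partial_{\mathrm{vis}(x),n}(C)$
is $\mathbb{Z}^d$-connected in $B_n$.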


PROOF. The first part of the lemma is Lemma 2.23 of [9]. We prove the second
part by altering Kesten's proof. The topological setting is that of a closed d-ball
instead of $\mathbb{R}^d$, making the changes more than merely notational. We write

\[ U = \{x \in \mathbb{R}^d : |x(i)| \le 1/2,\ i \in \{1, \ldots, d\}\} \]
and
\[ U^* = \{x \in \mathbb{R}^d : |x(i)| < 1/2 + \varepsilon,\ i \in \{1, \ldots, d\}\} \]

for some $\varepsilon \in (0, 1/8]$, and we set $N = \bigcup_{v \in C}(v + U^*) \cap [0, n - 1]^d$. Kesten first
proves, by invoking Alexander and Poincaré duality, that, if $N' \subset \mathbb{R}^d$ is bounded
and connected with its topological boundary $\partial N'$ a topological $(d - 1)$-manifold,
then the set

\[
\partial_{\mathrm{ext}}(N') := \{y \in \partial N' : \exists \text{ a continuous path } p : (0, \infty) \to \mathbb{R}^d \setminus N',\
p(t) \to y \text{ as } t \to 0,\ |p(t)| \to \infty \text{ as } t \to \infty\}
\tag{2.3}
\]

is path connected. We define for any sets $F, O \subset \mathbb{R}^d$ such that $O \cap F$ is open in
the subspace topology in $F$, and for each point $x \in F \setminus \mathrm{clos}(O)$,

\[
\partial_{\mathrm{vis}(x),F}(O) = \{y \in F \cap \partial O : \exists \text{ continuous path } p : [0, T] \to F :\
p(0) = x,\ p(T) = y,\ p[0, T] \cap O = \emptyset\}.
\]

[Note that the use of the $\partial_{\mathrm{ext}}$ and $\partial_{\mathrm{vis}}$ notation causes no conflict with the discrete
case (2.1) or (2.2), because we are considering $O \subset \mathbb{R}^d$.] The analogue of the path
connectedness of the set (2.3) that we require is that, for each $x \in [0, n - 1]^d \setminus
\mathrm{clos}(N)$,

\[ \text{the set } \partial_{\mathrm{vis}(x),[0,n-1]^d}(N) \text{ is } [0, n - 1]^d\text{-path connected.} \tag{2.4} \]


To establish (2.4), let $x \in [0, n - 1]^d \setminus \mathrm{clos}(N)$ be fixed. We assume in the first
instance that $N \cap \partial([0, n - 1]^d) \ne \emptyset$, and will prove (2.4) in this instance by
reducing to the $\mathbb{R}^d$ case by a doubling trick. We let closed d-balls $B_d(1)$ and $B_d(2)$
denote two homeomorphic images of $[0, n - 1]^d$, writing $x$ and $N$ for the images
in $B_d(1)$ of $x$ and $N$ by a harmless abuse of notation. We glue the two balls by
identifying their boundaries to form $\Gamma = B_d(1) \cup_{S^{d-1}} B_d(2)$, which is a
homeomorphic image of the sphere $S^d$. Let $\phi : \Gamma \to \Gamma$ map each point in $B_d(i)$ to the
corresponding one in $B_d(3 - i)$ for $i \in \{1, 2\}$ [so that $\phi$ fixes the common
boundary of $B_d(1)$ and $B_d(2)$], and let $\psi : \Gamma \to \Gamma$ be given by


fi 2)» if y € B4(1), 
$0 = [5 if y ¢ By(1). 




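Define $\tilde{N} = N \cup_Z \phi(N)$, where $Z = N \cap \partial B_d(1)$ is the part of the boundary
of $B_d(1)$ along which $N$ and its mirror $\phi(N)$ are glued. If $\chi : \Gamma \setminus \{x\} \to \mathbb{R}^d$ is
a homeomorphism, then $\partial_{\mathrm{vis}(x), \Gamma}(\tilde{N}) = \chi^{-1}(\partial_{\mathrm{ext}}(\chi(\tilde{N})))$. We now check the conditions
on $\chi(\tilde{N})$ that permit us to apply (2.3) with the choice $N' = \chi(\tilde{N})$. Note that
$\chi(\tilde{N}) \subseteq \mathbb{R}^d$ is bounded, because $x \notin \mathrm{clos}(\tilde{N})$, and that $\chi(\tilde{N})$ is connected because
the fact that $N \cap \partial[0, n-1]^d \ne \emptyset$ implies that $\tilde{N}$ is connected. We must also show
that $\partial_{\mathbb{R}^d}(\chi(\tilde{N}))$ is a topological $(d-1)$-manifold. Kesten proved that

(2.5)    $\partial_{\mathbb{R}^d} \bigcup_{v \in C} (v + U^{\epsilon})$ is a topological $(d-1)$-manifold.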


Recalling that we consider $N$ as a subset of $B_d(1)$, it follows easily from the definition
of $N$ that $Z = N \cap \partial B_d(1)$ is a submanifold of $\partial B_d(1)$ with boundary. This
is a sufficient condition for the doubled object $\tilde{N}$ to be a topological $d$-manifold
and furthermore, for its boundary in $\Gamma$ to be given by

(2.6)    $\partial_{\Gamma}(\tilde{N}) = (\partial_{B_d(1)} N \setminus Z^{\circ}) \cup_{\partial Z} (\partial_{B_d(2)} \phi(N) \setminus Z^{\circ})$,

where $Z^{\circ} = Z \setminus \partial Z$, with $\partial Z$ denoting the boundary in $\partial B_d(1)$ of $Z$ [in (2.6), $Z^{\circ}$ is
resp. embedded in $B_d(1)$ or $B_d(2)$]. See Chapter 8.2 of [12] for a discussion of
gluing of such manifolds. We must now check that the right-hand side of (2.6)
is a $(d-1)$-manifold. We will do so by applying the same sufficient condition
already mentioned, which in this case means that $\partial Z$ is a $(d-2)$-manifold. We
show in fact that each of the finitely many connected components of $\partial Z$ is a
$(d-2)$-manifold. To this end, let $Z'$ denote one of these components. Note that
$U \cap Z'$ is homeomorphic to an open subset of a set of the form appearing in (2.5),
with $d$ replaced by $d-1$, provided that $U$ is an open set lying in a part of $\partial B_d(1)$
corresponding to an open face of $[0, n-1]^d$. If $U$ is a neighborhood of a point in
the boundary of at least two such faces, then an initial homeomorphism is required
to flatten $\partial B_d(1) \cap U$. The image of $U \cap Z'$ under this map is then homeomorphic
to the same type of set as in the other case. We thus learn from (2.5) (applied
with $d$ replaced by $d-1$) that $Z'$ is indeed a manifold. We find that the glued
object $\partial_{\Gamma}(\tilde{N})$ is a $(d-1)$-manifold, so that $\partial_{\mathbb{R}^d}(\chi(\tilde{N}))$ is, as well.

We may apply (2.3) to $\chi(\tilde{N})$ as we sought, learning by doing so that $\partial_{\mathrm{ext}}(\chi(\tilde{N}))$
is path connected, so that $\partial_{\mathrm{vis}(x), \Gamma}(\tilde{N})$ is $\Gamma$-path connected. We now use this
fact to verify that the set in (2.4) is $[0, n-1]^d$-path connected. To this end,
let $y, z \in \partial_{\mathrm{vis}(x), B_d(1)}(N)$. As $N \subseteq B_d(1)$, we may consider $y$ and $z$ as members
of $\partial_{\mathrm{vis}(x), \Gamma}(\tilde{N})$. We use the $\Gamma$-path connectedness of this last set to find a
path $p : [0, 1] \to \partial_{\mathrm{vis}(x), \Gamma}(\tilde{N})$ such that $p(0) = y$ and $p(1) = z$. We claim that
the path $\psi \circ p : [0, 1] \to B_d(1)$ satisfies $\psi \circ p[0, 1] \subseteq \partial_{\mathrm{vis}(x), B_d(1)}(N)$. Indeed, to
any $t \in [0, 1]$, let $q : [0, 1] \to \Gamma$ denote a continuous map satisfying $q(0) = x$,
$q(1) = p(t)$ and $q[0, 1] \subseteq \Gamma \setminus \tilde{N}$ [such a map existing because $p(t) \in \partial_{\mathrm{vis}(x), \Gamma}(\tilde{N})$].
The map $\psi \circ q : [0, 1] \to B_d(1) \setminus N$ demonstrates that $\psi \circ p(t) \in \partial_{\mathrm{vis}(x), B_d(1)}(N)$ for
each $t \in [0, 1]$. Since $y = (\psi \circ p)(0)$ and $z = (\psi \circ p)(1)$, we have proved that the
set in (2.4) is $[0, n-1]^d$-path connected, in the case that $N \cap \partial([0, n-1]^d) \ne \emptyset$.

If $N \cap \partial[0, n-1]^d = \emptyset$, let $Y$ denote the topological space formed from
$[0, n-1]^d$ by identifying all points in $\partial([0, n-1]^d)$. Define a homeomorphism
$\xi : Y \setminus \{x\} \to \mathbb{R}^d$, and note that we may apply (2.3) with the choice $N' = \xi(N)$ by
a similar argument to that which permitted the choice $N' = \chi(\tilde{N})$. We learn that,
in this case also, the set $\partial_{\mathrm{vis}(x), [0, n-1]^d}(N) = \xi^{-1}(\partial_{\mathrm{ext}}(\xi(N)))$ is $[0, n-1]^d$-path
connected, as we sought.

Second, Kesten proves that if $v', v'' \in \mathbb{Z}^d$ are connected by a path $\phi : [0, 1] \to
\mathbb{R}^d \setminus N$, then $v'$ and $v''$ may be connected by a $\mathbb{Z}^d$-path that is disjoint from $N$,
and intersects only such cubes $v + U$, $v \in \mathbb{Z}^d$, that also contain a point of $\phi$. Correspondingly,
our pair of sites $v', v''$ lies in $[0, n-1]^d$, the path $\phi$ connects
them in $[0, n-1]^d \setminus N$, and we require that the $\mathbb{Z}^d$-path lies in $[0, n-1]^d \setminus N$. For
this, the path produced in the $\mathbb{R}^d$-case does the job.

There are two more steps in Kesten's argument. The use of $[0, n-1]^d$ in
place of $\mathbb{R}^d$ makes no difference to the proofs of these steps, and we only state
the form they take in our case. The third step asserts that, for any $x \in B_n \setminus C$,
and for each $v \in B_n$, we have that $v \in \partial_{\mathrm{vis}(x), n}(C)$ if and only if $v \notin C$
and $(v + U) \cap \partial_{\mathrm{vis}(x), [0, n-1]^d}(N) \ne \emptyset$. The fourth step claims that each pair
$v', v'' \in \partial_{\mathrm{vis}(x), n}(C)$ can be connected by a path in $[0, n-1]^d$ which lies in
$(v' + U) \cup (v'' + U) \cup \partial_{\mathrm{vis}(x), [0, n-1]^d}(N)$. Similarly to Kesten, this last path may be
deformed by the procedure in the second step into a path in $B_n$ that contains only
sites of $\partial_{\mathrm{vis}(x), n}(C)$. Thus, $\partial_{\mathrm{vis}(x), n}(C)$ is connected in $B_n$, as required. □


LEMMA 2.2. Suppose that the connected sets $C, D \subseteq B_n$ are disjoint, and
that $E \subseteq B_n$ separates $C$ and $D$ in $B_n$, and is disjoint from $D$. Then there exists
an $\mathcal{L}$-connected subset of $E$ that also separates $C$ and $D$ in $B_n$. Suppose instead
that $C$, $D$ and $E$ are subsets of $\mathbb{Z}^d$ that satisfy the same conditions of disjointness,
and that $C$ and $D$ are connected. If $E$ separates $C$ and $D$, then, similarly, an
$\mathcal{L}$-connected subset of $E$ separates $C$ and $D$.


PROOF. We comment briefly on the content of what is asserted in the lemma,
before proving it. In the case of the first part of the lemma, set

(2.7)    $\tilde{C} = \bigcup_{v \in C \setminus E} \{ x \in B_n : \exists \text{ a path in } E^c \text{ from } x \text{ to } v \}$

and

(2.8)    $\hat{C} = (C \cap E) \cup \partial_{B_n} \tilde{C}$.

Note that $\hat{C} \subseteq E$, and that $\hat{C}$ separates $C$ and $D$ in $B_n$: indeed, the first element
of $E$ encountered in any path in $B_n$ from $C$ to $D$ lies in $\hat{C}$. From this fact and the
disjointness of $D$ and $E$, it is straightforward that, for any $y \in D$, the set

(2.9)    $\{ v \in \hat{C} : v \text{ is } \mathbb{Z}^d\text{-adjacent to some } w \in B_n \setminus \hat{C},\ \exists \text{ a path in } B_n \text{ from } y \text{ to } w \text{ disjoint from } \hat{C} \}$

separates $C$ and $D$ in $B_n$. A variant of the second part of Lemma 2.1 asserting
that the set (2.9) is $\mathcal{L}$-connected would therefore suffice for our purpose. We mention
that a sketch of an elementary proof of a similar assertion appears in the Appendix
of [20]. The variant might also be obtained by re-examining the proof of
Lemma 2.23 of [9]. Instead of doing this, we will prove the first assertion of the
lemma by finding an $\mathcal{L}$-connected component of $\hat{C}$ that separates $C$ and $D$ in $B_n$,
making direct use of the second part of Lemma 2.1 in the course of the argument.
Let $\{\gamma_1, \ldots, \gamma_k\}$ denote the $\mathcal{L}$-connected components of $\hat{C}$, and let

$\phi_j = \{ v \in B_n : \gamma_j \text{ separates } v \text{ from } D \text{ in } B_n \}$.

We claim that

(2.10)    $\phi_i \cap \phi_j \ne \emptyset$ implies that either $\phi_i \subseteq \phi_j$, or $\phi_j \subseteq \phi_i$.


For $x \in \phi_i \cap \phi_j$, we consider the case that there exists a path $\tau$ in $B_n$ from $x$ to $D$
that intersects $\gamma_i$ before it intersects $\gamma_j$. We will show that $\phi_i \subseteq \phi_j$, and, to this
end, we let $y \in \phi_i$ be arbitrary. Let $(x = x_0, x_1, \ldots, x_{r_1})$ denote the segment of the
path $\tau$ until its first intersection with $\gamma_i$ (so that $x_{r_1} \in \gamma_i$). Let $(y = y_0, y_1, \ldots, y_{r_2})$
denote any path in $B_n$ from $y$ to $D$. There exists $r_3 \in \{0, \ldots, r_2\}$ for which $y_{r_3} \in \gamma_i$,
since $y \in \phi_i$. Note that

(2.11)    $\{ v \in B_n : \ell_\infty(v, \gamma_i) \le 1 \} \cap \gamma_j = \emptyset$ for $j \ne i$,

because the sets $\gamma_i$ are $\mathcal{L}$-connected components of a larger set. There exists
an $\mathcal{L}$-path from $x_{r_1}$ to $y_{r_3}$ in $\gamma_i$. Any consecutive members $u$ and $v$ of this
path each lie in a unit cube contained in $B_n$. We may find a $\mathbb{Z}^d$-path in this
cube from $u$ to $v$. In this way, we may find a path $(x_{r_1} = z_1, \ldots, z_{r_4} = y_{r_3})$
in $B_n$ such that $\ell_\infty(z_l, \gamma_i) \le 1$ for $l \in \{1, \ldots, r_4\}$, and thus, by (2.11), such that
$z_l \notin \gamma_j$ for these $l$ and for $j \ne i$. Note that the path $(x = x_0, x_1, \ldots, x_{r_1} =
z_1, \ldots, z_{r_4} = y_{r_3}, \ldots, y_{r_2})$ connects $x$ to $D$ in $B_n$. Note also that its initial
segment $(x_0, x_1, \ldots, x_{r_1}, z_2, \ldots, z_{r_4})$ is disjoint from $\gamma_j$ by construction. The fact
that $x \in \phi_j$ implies that $y_r \in \gamma_j$ for some $r \in \{r_3 + 1, \ldots, r_2\}$. The site $y_r$ lies in the
path $(y_0, \ldots, y_{r_2})$, which was chosen to be an arbitrary path from $y$ to $D$. Thus,
$y \in \phi_j$, and $\phi_i \subseteq \phi_j$, as we claimed. The conclusion $\phi_j \subseteq \phi_i$ arises in the other
case, where there exists a path from $x$ to $D$ that intersects $\gamma_j$ before it intersects $\gamma_i$.
This establishes (2.10).

Note that (2.10) implies that to each $i \in \{1, \ldots, k\}$, there is a unique $j \in
\{1, \ldots, k\}$ such that $\phi_i \subseteq \phi_j$ and for which $\phi_j \subseteq \phi_l \Rightarrow j = l$. Let $\{j_1, \ldots, j_s\}$
denote the collection of $j \in \{1, \ldots, k\}$ arising in this way for some $i \in \{1, \ldots, k\}$.
Note that, by (2.10), $\phi_{j_i} \cap \phi_{j_l} = \emptyset$ for $i, l \in \{1, \ldots, s\}$ with $i \ne l$. We claim that

(2.12)    $C \subseteq \bigcup_{i=1}^{s} \phi_{j_i}$.


To derive this, suppose that $x \notin \bigcup_{i=1}^{s} \phi_{j_i} = \bigcup_{i=1}^{k} \phi_i$ for some $x \in B_n$. Let $\tau =
(x = v_0, \ldots, v_{r_5})$ denote an arbitrary path in $B_n$ from $x$ to $D$. If $\tau \cap \hat{C} = \emptyset$, then
$x \notin C$, because $\hat{C}$ separates $C$ and $D$ in $B_n$. We may assume then that $\tau \cap \hat{C} \ne \emptyset$,
and that $\tau$ intersects $\gamma_1$ before it intersects any $\gamma_i$ for $i \in \{2, \ldots, k\}$. In this case,
let $0 < r_6 \le r_7 < r_5$ be such that $v_{r_6}$ is the first visit of $\tau$ to $\gamma_1$, while $v_{r_7}$ is the last
such. (That $r_6 > 0$ follows from $x \notin \phi_1$, and thus $x \notin \gamma_1$.) We next show that

(2.13)    $v_{r_6 - 1}, v_{r_7 + 1} \in \bigcup_{y \in D} \partial_{\mathrm{vis}(y), n}(\gamma_1)$;

first, each of these two sites is adjacent to a member of $\gamma_1$. Note also that
$(v_{r_7 + 1}, \ldots, v_{r_5})$ is a path in $B_n \setminus \gamma_1$ from $v_{r_7 + 1}$ to $D$. Since $x \notin \phi_1$, we may find a
path in $B_n$ from $x$ to some site $y \in D$ that is disjoint from $\gamma_1$. Concatenating this
path to $(v_{r_6 - 1}, v_{r_6 - 2}, \ldots, v_1)$ forms a path in $B_n$ from $v_{r_6 - 1}$ to $y$ that is disjoint
from $\gamma_1$. Thus, $v_{r_6 - 1} \in \partial_{\mathrm{vis}(y), n}(\gamma_1)$, and (2.13).

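Note that $\partial_{\mathrm{vis}(y), n}(\gamma_1)$ is independent of $y \in D$ because the connected set $D$
satisfies $D \cap E = \emptyset$ and thus $D \cap \gamma_1 = \emptyset$ (because $\gamma_1 \subseteq \hat{C} \subseteq E$). By the second
assertion of Lemma 2.1, we may choose a path $(v_{r_6 - 1} = u_0, u_1, \ldots, u_{r_8} = v_{r_7 + 1})$
that lies in $\partial_{\mathrm{vis}(y), n}(\gamma_1)$ for any given $y \in D$. It follows from $\partial_{\mathrm{vis}(y), n}(\gamma_1) \subseteq \{ v \in
\mathbb{Z}^d : \ell_\infty(v, \gamma_1) \le 1 \}$ and (2.11) that

$\{u_0, \ldots, u_{r_8}\} \cap \bigcup_{i=2}^{k} \gamma_i = \emptyset$.

We alter the path $\tau$ to form

$(x = v_0, \ldots, v_{r_6 - 1} = u_0, u_1, \ldots, u_{r_8} = v_{r_7 + 1}, \ldots, v_{r_5})$.

This new path reaches $D$ from $x$, is disjoint from $\gamma_1$ and intersects $\bigcup_{i=2}^{k} \gamma_i$ only
at points where the path $\tau$ does. By performing alterations that similarly remove
the intersections of the new path with the other sets $\gamma_i$, we produce a path in $B_n$
from $x$ to $D$ that is disjoint from $\bigcup_{i=1}^{k} \gamma_i = \hat{C}$. Since $\hat{C}$ separates $C$ and $D$ as noted
after (2.8), $x \notin C$. We have proved (2.12).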

We now claim that

(2.14)    there exists $i \in \{1, \ldots, s\}$ for which $C \subseteq \phi_{j_i}$.

Were this not the case, there would exist by (2.12) adjacent sites $w_1, w_2 \in C$
such that $w_1 \in \phi_{j_{i_1}}$ and $w_2 \in \phi_{j_{i_2}}$ for indices $i_1, i_2 \in \{1, \ldots, s\}$ satisfying $i_1 \ne i_2$.
By (2.11), one of $w_1 \in \gamma_{j_{i_1}}$ and $w_2 \in \gamma_{j_{i_2}}$ fails. We may assume that $w_1 \notin \gamma_{j_{i_1}}$. The
sets $\phi_{j_i}$ being disjoint, $w_2 \in \phi_{j_{i_2}}$ implies that there exists a path $\sigma$ from $w_2$ to $D$
that is disjoint from $\gamma_{j_{i_1}}$. The fact that $w_1 \notin \gamma_{j_{i_1}}$ implies that the path formed by
prefixing $w_1$ to $\sigma$ reaches $D$ from $w_1$ and is disjoint from $\gamma_{j_{i_1}}$. We have reached
the contradiction that $w_1 \notin \phi_{j_{i_1}}$ and have proved (2.14). From (2.14) follows the
first statement of the lemma, because $\gamma_{j_i}$ is $\mathcal{L}$-connected.

The second assertion has the same proof, with the first part of Lemma 2.1 being
applied, instead of the second part. The notational changes consist of omitting
each reference to "in $B_n$," including in the term $\partial_{\mathrm{vis}(y)}(\cdot)$ (which is independent of
$y \in D$). □
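The component structure used in the proof above can be explored on small examples. The following Python sketch is only an illustration and not part of the argument: it assumes that two sites are $\mathcal{L}$-adjacent exactly when their $\ell_\infty$-distance equals 1 (the definition of $\mathcal{L}$ is given earlier in the paper and is not restated here), and the function names are hypothetical. It splits a finite set of sites into components under that adjacency and checks the analogue of (2.11), namely that no site of one component lies within $\ell_\infty$-distance 1 of another component.

```python
from itertools import product


def linf(u, v):
    """l_infinity distance between two sites of Z^d given as tuples."""
    return max(abs(a - b) for a, b in zip(u, v))


def l_components(sites):
    """Split a finite set of sites into components, ASSUMING that two sites
    are L-adjacent exactly when their l_infinity distance is 1."""
    sites = set(sites)
    if not sites:
        return []
    d = len(next(iter(sites)))
    # All 3^d - 1 non-zero offsets with coordinates in {-1, 0, 1}.
    offsets = [o for o in product((-1, 0, 1), repeat=d) if any(o)]
    seen, comps = set(), []
    for start in sorted(sites):
        if start in seen:
            continue
        comp, stack = {start}, [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            for o in offsets:
                w = tuple(a + b for a, b in zip(v, o))
                if w in sites and w not in seen:
                    seen.add(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def components_are_separated(comps):
    """Analogue of (2.11): distinct components are at l_infinity distance >= 2."""
    for i, a in enumerate(comps):
        for b in comps[i + 1:]:
            if any(linf(u, v) <= 1 for u in a for v in b):
                return False
    return True
```

For instance, the set $\{(0,0), (1,1), (3,3), (3,4), (6,0)\}$ splits into three components under this assumed adjacency, since diagonal neighbors are joined.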


LEMMA 2.3. Let $B \subseteq \mathbb{Z}^d$ denote the collection of sites that a finite set $A \subseteq \mathbb{Z}^d$
separates from infinity. Then

$|A| \ge |B|^{(d-1)/d}$.

Let $C, D, E \subseteq B_n$, with $C$ and $D$ connected, and $C \cap D = \emptyset$. Suppose that $E$
properly separates $C$ and $D$ in $B_n$. Then

$|E| \ge \frac{1}{2d} \min\left( |C|^{(d-1)/d}, |D|^{(d-1)/d} \right)$.


PROOF. To prove the first part of the lemma, note that, for each $i \in \{1, \ldots, d\}$
and $x \in B$, $\{x + n e_i : n \in \mathbb{Z}\} \cap A \ne \emptyset$, where $\{e_i : i \in \{1, \ldots, d\}\}$ denote unit vectors
in the directions of the coordinate axes of $\mathbb{R}^d$. It follows that

(2.15)    $|A| \ge \max_{i \in \{1, \ldots, d\}} |\mathrm{proj}_i(B)|$,

where

$\mathrm{proj}_i(B) = \{ (a_1, \ldots, a_{d-1}) \in \mathbb{Z}^{d-1} : \exists x \in \mathbb{Z} \text{ such that } (a_1, \ldots, a_{i-1}, x, a_i, \ldots, a_{d-1}) \in B \}$.

By the Loomis-Whitney inequality [18],

(2.16)    $\max_{i \in \{1, \ldots, d\}} |\mathrm{proj}_i(B)| \ge |B|^{(d-1)/d}$,

so that we obtain the first part of the lemma from (2.15) and (2.16).
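The projection bound (2.16) can be checked numerically on small finite sets. The sketch below is an illustration only, with hypothetical function names: it computes the $d$ coordinate projections of a finite subset of $\mathbb{Z}^d$, tests the Loomis-Whitney inequality in its product form $|B|^{d-1} \le \prod_i |\mathrm{proj}_i(B)|$, and tests the consequence $\max_i |\mathrm{proj}_i(B)|^d \ge |B|^{d-1}$, using integer arithmetic to avoid rounding.

```python
from itertools import product


def projections(B):
    """Return the d coordinate projections of a finite, non-empty set B of
    points of Z^d, where projection i deletes the i-th coordinate."""
    B = set(B)
    d = len(next(iter(B)))
    return [{x[:i] + x[i + 1:] for x in B} for i in range(d)]


def loomis_whitney_holds(B):
    """Check |B|^(d-1) <= prod_i |proj_i(B)| with exact integers."""
    projs = projections(B)
    d = len(projs)
    prod_sizes = 1
    for p in projs:
        prod_sizes *= len(p)
    return len(set(B)) ** (d - 1) <= prod_sizes


def max_projection_bound_holds(B):
    """Check the form used in (2.16): max_i |proj_i(B)| >= |B|^((d-1)/d),
    rewritten as max_i |proj_i(B)|^d >= |B|^(d-1)."""
    projs = projections(B)
    d = len(projs)
    return max(len(p) for p in projs) ** d >= len(set(B)) ** (d - 1)
```

A full cube attains equality in the product form, which makes it a convenient sanity check.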

To begin the proof of the second part of the lemma, note that the first part of
Lemma 2.2 permits us to assume that $E$ is $\mathcal{L}$-connected. We may also assume that
$E \cap (C \cup D) = \emptyset$, because $E \setminus (C \cup D)$ properly separates $C$ and $D$. Recalling
the definition (2.7), and defining $\tilde{D}$ analogously, note that

(2.17)    $\tilde{C} \cap \tilde{D} = \emptyset$.

This is because, otherwise, we might construct a path in $E^c$ from $C$ to $D$.
By (2.17), at least one of the inequalities $|\tilde{C}| \le \frac{1}{2} n^d$ and $|\tilde{D}| \le \frac{1}{2} n^d$ holds. We
suppose the former for the time being. It follows from Theorem 19 of [13] that, for
any $A \subseteq B_n$ for which $|A| \le n^d / 2$, we have that

(2.18)    $|\partial_{B_n} A| \ge \frac{1}{2d} |A|^{(d-1)/d}$.

Note that $\partial_{B_n} \tilde{C} \subseteq E$. Indeed, it was noted after (2.8) that $\hat{C} \subseteq E$, and $\partial_{B_n} \tilde{C} = \hat{C}$ in
the current case, because $C \cap E = \emptyset$. We find that

$|E| \ge |\partial_{B_n} \tilde{C}| \ge \frac{1}{2d} |\tilde{C}|^{(d-1)/d} \ge \frac{1}{2d} |C|^{(d-1)/d}$,

where (2.18) with the choice $A = \tilde{C}$ was used in the second inequality, this
choice being valid because $|\tilde{C}| \le \frac{1}{2} n^d$. The third inequality follows from $C \subseteq \tilde{C}$,
which is implied by $C \cap E = \emptyset$.

In the case where $|\tilde{D}| \le \frac{1}{2} n^d$, we deduce that $|E| \ge \frac{1}{2d} |D|^{(d-1)/d}$. This completes
the proof of the second part of the lemma. □


Having assembled these preliminaries, we now state and prove a lemma that is 
central to the proof of Theorem 1.1. 


LEMMA 2.4. For given constants $c > 0$ and $\lambda, \rho < \infty$, we say that an $m$-box
$B = B_{v,m}$ satisfies condition $A^{\rho}_{\lambda,c}$ if there exists $\gamma^* \in \mathcal{A}_B$ such that $S(\gamma^*) =
G_B \ge c m^d$, $|\gamma^*| \ge (\log m)^{\rho} + 1$, and $D(\gamma, \gamma^*) \le \rho m$ for all $\gamma \in \mathcal{A}_{B[2]}$ such
that $|\gamma| \ge (\log m)^{\rho}$. We set $q_{m,\lambda,c,\rho} := P(B \text{ satisfies condition } A^{\rho}_{\lambda,c})$, which is
independent of the choice of the $m$-box $B$. Suppose that the distribution $F$ satisfies
(1.2) and is such that $N > 0$. Then, for $\lambda, \rho < \infty$ sufficiently large and
$c \in (0, \liminf n^{-d} G_n)$, we have that

$\lim_{m \to \infty} q_{m,\lambda,c,\rho} = 1$.


REMARK. Theorem 3.2 of [3] states that, in the case that $N > 0$, there exists a
constant $c > 0$ such that $\liminf n^{-d} G_n \ge c > 0$ almost surely.


PROOF OF LEMMA 2.4. Fixing $\epsilon > 0$, (1.2) implies that $\int_0^{\infty} x^d \, dF(x) < \infty$
(this can be derived directly or by contrasting Theorem 2.2 of [3] with the note
in page 207 of that paper). Following the proof of (3.17) in [2], this implies that
$X_v > \|v\|$ for at most finitely many $v \in \mathbb{Z}^d$ almost surely, which implies, in view
of $N > 0$ and Theorem 3.2 of [3], that, for some $c_1 > 0$,

(2.19)    $L_m \ge c_1 m^{d-1}$ for all sufficiently high $m$.

Thus, for all $m$ large enough,

(2.20)    $P(L_B \ge c_1 m^{d-1}) \ge 1 - \epsilon$

for any $m$-box $B$.




For $\lambda$ such that

(2.21)    $p := P(X_0 > -\lambda) > \max\{ p_c, 1 - p_c(\mathbb{Z}^d, \mathcal{L}) \}$,

let $W$ denote the unique infinite cluster of $\lambda$-white sites in $\mathbb{Z}^d$ (which exists by
Theorem 8.1 of [6], since the $\lambda$-white sites form a supercritical percolation). Note
that this choice of $\lambda$ ensures that the process of black $\mathcal{L}$-clusters is subcritical.
Increasing $\lambda$ as needed, we next show that for any $\rho > d/(d-1)$, all $m$ large
enough and any $m$-box $B$,

(2.22)    $P(\exists \gamma \in \mathcal{A}_{B[2]} \text{ such that } |\gamma| \ge (\log m)^{\rho} \text{ and } \gamma \cap W = \emptyset) \le \epsilon$.


For such a $\gamma$ as in (2.22), there exists, by choosing $C = \gamma$, $D = W$ and $E =
\{\text{black sites}\}$ in the second part of Lemma 2.2, an $\mathcal{L}$-connected set $\hat{\gamma}$ of black sites
that separates $\gamma$ and $W$. Applying the first part of Lemma 2.3, we find that

(2.23)    $|\hat{\gamma}| \ge (\log m)^{\rho(d-1)/d}$,

because $\hat{\gamma}$ separates $\gamma$ from $\infty$. We set $\Lambda = \Lambda_{\gamma}$ according to $\Lambda = \inf\{ q \ge 2 : \hat{\gamma} \cap
B[q] \ne \emptyset \}$. Note that if $\Lambda > 2$, then $B[\Lambda - 1]$ is separated from $W$, and thus
from $\infty$, by $\hat{\gamma}$: indeed, a path from $B[\Lambda - 1]$ to $W$ disjoint from $\hat{\gamma}$ could be
extended in $B[\Lambda - 1]$ to such a path from $\gamma$ because $B[\Lambda - 1] \cap \hat{\gamma} = \emptyset$. By
Lemma 2.3 again,

(2.24)    $\{\Lambda = q\} \subseteq \{ |\hat{\gamma}| \ge ((2q-1)m)^{d-1} \}$ for $q \ge 3$,

since $|B[q-1]| = [(2q-1)m]^d$. Thus,

(2.25)    $P(\exists \gamma \in \mathcal{A}_{B[2]} \text{ such that } |\gamma| \ge (\log m)^{\rho} \text{ and } \gamma \cap W = \emptyset)$
    $\le (5m)^d P(|B_{\mathcal{L}}(0)| \ge (\log m)^{\rho(d-1)/d}) + \sum_{q=3}^{\infty} [(2q+1)m]^d P(|B_{\mathcal{L}}(0)| \ge ((2q-1)m)^{d-1})$
    $\le (5m)^d \exp\{ -c_2 (\log m)^{\rho(d-1)/d} \} + \sum_{q=3}^{\infty} [(2q+1)m]^d \exp\{ -c_2 (2q-1)^{d-1} m^{d-1} \}$,

for all $m$ and any $m$-box $B$. The first term after the first inequality in (2.25)
corresponds to the case where $\Lambda = 2$, in which case, $\hat{\gamma} \cap B[2] \ne \emptyset$, and thus,
$|B_{\mathcal{L}}(x)| \ge (\log m)^{\rho(d-1)/d}$ for some $x \in B[2]$ by (2.23). The term indexed by $q$ in
the sum after the first inequality corresponds to the case where $\Lambda = q$, with (2.24)
being used in place of (2.23). That the constant $c_2$ is positive follows from the
Aizenman-Newman theorem in Section 2.4.2 of [7], which proves an exponential
rate of decay for the probability of a large subcritical cluster containing a given
site in any homogeneous lattice where each vertex has finite degree, including $\mathcal{L}$.
From (2.25), we obtain (2.22) for all but finitely many $m$.

Taking $\gamma^*$ to be a greedy lattice animal of minimal size in $B$, we find from
(2.20) and (2.22) that

(2.26)    $P(\{\gamma^* \cap W \ne \emptyset\} \cap \{ \text{if } \gamma \in \mathcal{A}_{B[2]} \text{ satisfies } |\gamma| \ge (\log m)^{\rho}, \text{ then } \gamma \cap W \ne \emptyset \}) \ge 1 - 2\epsilon$,

for all $m$ large enough and any $m$-box $B$.

We apply Lemma 2.14 of [3], which is a variant of a result in [1], to the supercritical
percolation of $\lambda$-white sites. Using a union bound, we find that there exists
$\rho = \rho(\lambda, d) > 1$ and $c_3 > 0$ such that, for all $m$ and any $m$-box $B$,

(2.27)    $P(\exists v, w \in B[2] : v \leftrightarrow w \text{ and } D(v, w) > \rho m) \le e^{-c_3 m}$.

By (2.26) and (2.27),

(2.28)    $P(\forall \gamma \in \mathcal{A}_{B[2]} \text{ such that } |\gamma| \ge (\log m)^{\rho} : D(\gamma, \gamma^*) \le \rho m) \ge 1 - 3\epsilon$,

for all $m$ large enough and any $m$-box $B$.

By the definition of $c$, we have that $P(S(\gamma^*) = G_B \ge c m^d) \ge 1 - \epsilon$ for all $m$
large enough and any $m$-box $B$. By (2.20), we have that $P(|\gamma^*| = L_B \ge (\log m)^{\rho} +
1) \ge 1 - \epsilon$ for such boxes $B$. So, in view of the assertion (2.28), the proof of the
lemma is complete. □


PROOF OF THEOREM 1.1. By Lemma 2.4, we may and shall set $\lambda < \infty$,
$2 < \rho < \infty$ and $c > 0$ such that $\lim_{m \to \infty} q_{m,\lambda,c,\rho} = 1$.


DEFINITION 2.3. For $\ell \in \mathbb{N}$, we say that $a \in \mathbb{Z}^d$ is $\ell$-active if the $\ell$-box $B_{\ell a, \ell}$
satisfies the condition $A^{\rho}_{\lambda,c}$ defined in Lemma 2.4.


DEFINITION 2.4. A random process $\tau$ taking values in subsets of $\mathbb{Z}^d$ is said
to be a $\rho$-near percolation of parameter $p \in (0, 1)$ provided that for any $x \in \mathbb{Z}^d$,
$P(x \in \tau) = p$, and for any $x, y \in \mathbb{Z}^d$ for which $\|x - y\| > \rho$, the events $\{x \in \tau\}$ and
$\{y \in \tau\}$ are independent.


LEMMA 2.5. For any $\ell \in \mathbb{N}$, the collection of $\ell$-active sites forms a
$(2\rho + 1)$-near percolation.

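PROOF. Note that, for given $a \in \mathbb{Z}^d$, the event

(2.29)    $\{ B_{\ell a, \ell} \text{ satisfies condition } A^{\rho}_{\lambda,c} \}$

is measurable with respect to $\sigma\{ X_v : \ell_\infty(v, B_{\ell a, \ell}) \le \rho \ell \}$. Indeed, the event that
there exists a lattice animal $\gamma^* \subseteq B_{\ell a, \ell}$ such that $S(\gamma^*) = G_{B_{\ell a, \ell}} \ge c \ell^d$ is measurable
with respect to $\sigma\{ X_v : v \in B_{\ell a, \ell} \}$. The event $E$ that $D(\gamma, \gamma^*) \le \rho \ell$ whenever
$\gamma \in \mathcal{A}_{B_{\ell a, \ell}[2]}$ satisfies $|\gamma| \ge (\log \ell)^{\rho}$ occurs if and only if for each such $\gamma$, there
exists a path $\tau_{\gamma, \gamma^*}$ of white sites of length at most $\rho \ell$ that has one site $w \in \gamma^*$ and
another in $\gamma$. Each site $\zeta \in \tau_{\gamma, \gamma^*}$ satisfies $\ell_\infty(\zeta, B_{\ell a, \ell}) \le \ell_\infty(\zeta, w) \le \rho \ell$ because
$w \in \gamma^* \subseteq B_{\ell a, \ell}$, whereas each site $v$ of $\gamma$ satisfies $\ell_\infty(v, B_{\ell a, \ell}) \le 2\ell$. Thus, the
event $E$ is measurable with respect to $\sigma\{ X_v : \ell_\infty(v, B_{\ell a, \ell}) \le \max\{\rho, 2\} \ell \}$. As $\rho > 2$,
this establishes (2.29).

For any pair $a_1, a_2 \in \mathbb{Z}^d$ that satisfy $\|a_1 - a_2\| > 2\rho + 1$,

(2.30)    $\{ v \in \mathbb{Z}^d : \ell_\infty(v, B_{\ell a_1, \ell}) \le \rho \ell \} \cap \{ v \in \mathbb{Z}^d : \ell_\infty(v, B_{\ell a_2, \ell}) \le \rho \ell \} = \emptyset$.

From (2.29) and (2.30), it follows that the events $\{a_1 \text{ is } \ell\text{-active}\}$ and $\{a_2 \text{ is }
\ell\text{-active}\}$ are independent. Noting that $P(a \text{ is active})$ is independent of $a$ completes
the proof of the lemma. □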
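The disjointness statement (2.30) is elementary arithmetic and can be verified by brute force on small cases. The sketch below is an illustration under stated assumptions rather than part of the proof: it assumes that the $\ell$-box $B_{\ell a, \ell}$ is the set $\ell a + \{0, \ldots, \ell - 1\}^d$, that $\|\cdot\|$ is the $\ell_\infty$-norm, and that $\rho \ell$ is exactly representable (for example $\rho$ an integer or a half-integer); the function names are hypothetical.

```python
import math
from itertools import product


def neighbourhood_intervals(a, ell, rho):
    """Coordinate ranges of {v in Z^d : l_inf(v, B) <= rho * ell}, where B is
    ASSUMED to be the box ell * a + {0, ..., ell - 1}^d."""
    reach = math.floor(rho * ell)
    return [(ell * ai - reach, ell * ai + ell - 1 + reach) for ai in a]


def neighbourhoods_disjoint(a1, a2, ell, rho):
    """Two such neighbourhoods are products of integer intervals, so they are
    disjoint exactly when the intervals are disjoint in some coordinate."""
    i1 = neighbourhood_intervals(a1, ell, rho)
    i2 = neighbourhood_intervals(a2, ell, rho)
    return any(hi1 < lo2 or hi2 < lo1
               for (lo1, hi1), (lo2, hi2) in zip(i1, i2))


def check_disjointness(ell, rho, radius, d=2):
    """Check the analogue of (2.30) for a1 = 0 and all a2 within the given
    l_infinity radius that satisfy ||a2|| > 2 * rho + 1."""
    origin = (0,) * d
    for a2 in product(range(-radius, radius + 1), repeat=d):
        if max(abs(c) for c in a2) > 2 * rho + 1:
            if not neighbourhoods_disjoint(origin, a2, ell, rho):
                return False
    return True
```

The check also shows that the separation requirement cannot be relaxed by much: with $\ell = 4$ and $\rho = 3$, the neighbourhoods of the boxes indexed by $(0, 0)$ and $(6, 0)$ do overlap.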


LEMMA 2.6. Let $\rho > 0$ be given. For any $q \in (0, 1)$, there exists $\epsilon > 0$ such
that if $\tau$ is any $\rho$-near percolation in $\mathbb{Z}^d$ of parameter exceeding $1 - \epsilon$, then there
exists an independent percolation $\tau'$ of parameter exceeding $q$ such that $\tau' \subseteq \tau$
almost surely.


PROOF. The statement of the lemma is implied by Theorem 0.0(1) of [17]. □


Recall from page 23 of [6] that the percolation probability $\theta(p)$ of a percolation
of parameter $p$ is given by $P_p(|C(0)| = \infty)$, where $C(0)$ denotes the cluster of
open sites containing the origin. We make use of the fact that

(2.31)    $\theta(p) \to 1$ as $p \to 1$,

which is implied by Theorem 8.8 of [6] (there, the result is being asserted for bond
percolation, but the arguments used are valid for the current case of site percolation).
By Lemma 2.6 and (2.31), we may choose $\delta > 0$ such that any $(2\rho + 1)$-near
percolation of parameter exceeding $1 - \delta$ contains a percolation $P$ whose parameter
$p$ is such that

(2.32)    $\theta(p) > 1 - \epsilon$.

We now fix $\ell \in \mathbb{N}$ such that $q_{\ell,\lambda,c,\rho} \ge 1 - \delta/2$. (We will also be requiring that $\ell$ is
sufficiently high relative to $\lambda$, $\rho$ and $c$, but we prefer to state the particular bounds
that are needed as they arise.) By Lemma 2.5, we may find such a percolation $P$
satisfying $P \subseteq \{ a \in \mathbb{Z}^d : a \text{ is active} \}$, where, now that $\ell$ is fixed, we write active
in place of $\ell$-active for the rest of the proof. For any $n \in \mathbb{N}$, we write throughout
the proof $n = F\ell + r$ with $F \in \mathbb{N}$ and $r \in \{0, \ldots, \ell - 1\}$ so that $F$ implicitly
depends on $n$. For each greedy lattice animal $\xi$ in $B_n$, let $W_{\xi}$ denote the collection
of $\ell$-boxes of the form $B_{\ell a, \ell}$ that are contained in $B_n$ (i.e., $a \in \{0, \ldots, F - 1\}^d$) and
which $\xi$ intersects. On several later occasions, we will use the following definition
and lemma.




DEFINITION 2.5. For $P$ a percolation on $\mathbb{Z}^d$, we write $P_{F,C}$ for the largest
connected component of $P \cap \{C, \ldots, F - 1 - C\}^d$.
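For concreteness, the object $P_{F,C}$ of Definition 2.5 can be computed on a finite configuration by a breadth-first search. The sketch below is an illustration with hypothetical names: it takes a set of open sites, restricts it to $\{C, \ldots, F - 1 - C\}^d$, and returns a largest nearest-neighbor connected component (ties, if any, are broken arbitrarily, which the definition in the text does not need to address).

```python
from collections import deque


def largest_component(open_sites, F, C, d):
    """Return a largest nearest-neighbor connected component of the open
    sites lying in {C, ..., F - 1 - C}^d (an empty set if there are none)."""
    lo, hi = C, F - 1 - C
    inside = {v for v in open_sites
              if len(v) == d and all(lo <= c <= hi for c in v)}
    seen, best = set(), set()
    for start in inside:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        seen.add(start)
        while queue:
            v = queue.popleft()
            # Visit the 2d nearest neighbors of v in Z^d.
            for i in range(d):
                for step in (-1, 1):
                    w = v[:i] + (v[i] + step,) + v[i + 1:]
                    if w in inside and w not in seen:
                        seen.add(w)
                        comp.add(w)
                        queue.append(w)
        if len(comp) > len(best):
            best = comp
    return best
```

Increasing the margin $C$ can split or shrink components, which is why the margin $\lfloor \rho \rfloor + 1$ is carried explicitly in the notation $P_{F, \lfloor \rho \rfloor + 1}$ later in the proof.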


LEMMA 2.7. For any $j \in \mathbb{N}$, and $P$ a percolation on $\mathbb{Z}^d$ of supercritical parameter
$p > p_c$, we have that

(2.33)    $\liminf_{n \to \infty} \frac{|P_{n,j}|}{n^d} \ge \theta(p)$ almost surely.


PROOF. We prove the lemma in the case where $j = 0$, the other cases being
no different. Let $P_{\infty}$ denote the unique infinite component of $P$. Given $\alpha \in (0, 1)$,
let the event $Q_n(\alpha)$ be given by

(2.34)    $Q_n(\alpha) = \{ \exists \text{ a connected component } C_n = C_n(\alpha) \text{ of } P \cap B_n \text{ such that if } D \subseteq P \cap B_n \text{ is connected and has radius exceeding } \alpha n, \text{ then } D \subseteq C_n \}$.

(Recall that the radius of a connected set $C \subseteq \mathbb{Z}^d$ is given by $\mathrm{rad}(C) =
\sup_{x, y \in C} \|x - y\|$.) It follows immediately from Theorem 5 of [22] that there exists,
for each $\alpha \in (0, 1)$, a constant $c(\alpha) > 0$ such that

(2.35)    $P(Q_n(\alpha)) \ge 1 - \exp\{-cn\}$,

for all sufficiently high values of $n$. (This result may also be derived as (2.24) of [1]
was.) The Borel-Cantelli lemma applied to (2.35) implies that $Q_n(\alpha)$ occurs for
all but finitely many $n$. We claim that, for each $\alpha \in (0, 1)$,

(2.36)    $P_{\infty} \cap \{ \lfloor 2\alpha n \rfloor, \ldots, n - 1 - \lfloor 2\alpha n \rfloor \}^d \subseteq C_n(\alpha)$,

for all $n$ sufficiently high. To derive (2.36), consider $x \in P_{\infty} \cap \{ \lfloor 2\alpha n \rfloor, \ldots, n -
1 - \lfloor 2\alpha n \rfloor \}^d$. Note that the connected component of $P \cap B_n$ in which the site $x$ lies
has a radius of at least $\lfloor 2\alpha n \rfloor > \alpha n$. Thus, if $Q_n(\alpha)$ occurs, then $x \in C_n$. Hence,
we obtain (2.36) for high values of $n$.

We bound

$|C_n| \ge |P_{\infty} \cap \{ \lfloor 2\alpha n \rfloor, \ldots, n - 1 - \lfloor 2\alpha n \rfloor \}^d| \ge |P_{\infty} \cap B_n| - 4 d \alpha n^d$

so that

(2.37)    $\liminf n^{-d} |C_n(\alpha)| \ge \liminf n^{-d} |P_{\infty} \cap B_n| - 4 d \alpha = \theta(p) - 4 d \alpha$,

the latter an almost sure equality that is due to an application of the ergodic theorem
to the process $P$. By the definition of the set $C_n(\alpha)$ appearing in (2.34), we
have that $C_n(\alpha) = P_{n,0}$, provided that $P_{n,0}$ has radius at least $\alpha n$. If $\alpha \in (0, 1)$
is chosen to be small enough that $\theta(p) > 4 d \alpha + 2\alpha$, then (2.37) implies that
$|C_n(\alpha)| > \alpha n^d$ for high $n$, whence $|P_{n,0}| > \alpha n^d$ for such $n$. Noting that, for any
finite connected set $B \subseteq \mathbb{Z}^d$,

$B \subseteq [\min_{x \in B} x_1, \max_{x \in B} x_1] \times \cdots \times [\min_{x \in B} x_d, \max_{x \in B} x_d]$,

and that $\max_{x \in B} x_i - \min_{x \in B} x_i \le \mathrm{rad}(B)$ for each $i \in \{1, \ldots, d\}$, we find that
$|B| \le \mathrm{rad}(B)^d$. We deduce that the radius of $P_{n,0}$ exceeds $\alpha n$, for high $n$, so that
indeed $C_n(\alpha) = P_{n,0}$ for such $n$. Given that (2.37) holds for each $\alpha \in (0, 1)$, we
deduce that

$\liminf n^{-d} |P_{n,0}| \ge \theta(p)$,

as we sought. □


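We define

$E_1 = \{ \text{for some greedy lattice animal } \xi \text{ in } B_n, \text{ there exist } a_1, a_2 \in P_{F, \lfloor \rho \rfloor + 1} \text{ such that } B_{\ell a_1, \ell} \in W_{\xi} \text{ and } B_{\ell a_2, \ell} \notin W_{\xi} \}$

and

(2.38)    $E_2 = \{ \text{for some greedy lattice animal } \xi \text{ in } B_n, \text{ we have that } W_{\xi} \cap \{ B_{\ell a, \ell} : a \in P_{F, \lfloor \rho \rfloor + 1} \} = \emptyset \}$.

We will now prove that the events $E_1$ and $E_2$ occur for finitely many values of $n$
almost surely. In the case of $E_1$, we first show that, for all $n$ sufficiently large,

(2.39)    $\{ L_n \ge (\log \ell)^{\rho} \} \cap E_1 = \emptyset$.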


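To derive (2.39), suppose that the event on the left-hand side occurs. The set
$P_{F, \lfloor \rho \rfloor + 1}$ being connected, we may suppose that $a_1$ and $a_2$ are adjacent. Let $\gamma_{a_2}$ denote
a lattice animal playing the role of $\gamma^*$ in the condition $A^{\rho}_{\lambda,c}$ that $B_{\ell a_2, \ell}$ satisfies.
We may locate a connected set $\phi \subseteq \xi$ satisfying $|\phi| = \lfloor (\log \ell)^{\rho} \rfloor + 1$ and
$\phi \cap B_{\ell a_1, \ell} \ne \emptyset$, because $L_n \ge (\log \ell)^{\rho}$. Requiring that $\ell \ge \lfloor (\log \ell)^{\rho} \rfloor + 1$, we see
that $\phi \subseteq B_{\ell a_2, \ell}[2]$. The fact that $B_{\ell a_2, \ell}$ satisfies condition $A^{\rho}_{\lambda,c}$ implies that there
exists a white path $\tau_{\gamma_{a_2}, \phi}$ of length at most $\rho \ell$ from some site of $\gamma_{a_2}$ to some site
of $\phi \subseteq \xi$. Consider the lattice animal $\xi^+ = \xi \cup \tau_{\gamma_{a_2}, \phi} \cup \gamma_{a_2}$. Note first that $\xi^+$ is
connected, because $\xi$ and $\gamma_{a_2}$ are, and $\tau_{\gamma_{a_2}, \phi}$ is a path between them. Second, note
that $\xi^+ \subseteq B_n$; indeed, $\xi \subseteq B_n$, $\gamma_{a_2} \subseteq B_{\ell a_2, \ell} \subseteq B_n$, while $\tau_{\gamma_{a_2}, \phi}$ is a path of length
at most $\rho \ell$, with $\tau_{\gamma_{a_2}, \phi} \cap B_{\ell a_2, \ell} \ne \emptyset$. Since $a_2 \in P_{F, \lfloor \rho \rfloor + 1}$,

$\ell_\infty(B_{\ell a_2, \ell}, B_n^c) \ge (\lfloor \rho \rfloor + 1)\ell \ge \rho \ell$,

and thus, $\tau_{\gamma_{a_2}, \phi} \subseteq B_n$.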




We bound from below the weight of $\xi^+$:

(2.40)    $S(\xi^+) = S(\xi) + S(\gamma_{a_2}) + S(\tau_{\gamma_{a_2}, \phi} \setminus (\xi \cup \gamma_{a_2}))$
    $\ge S(\xi) + S(\gamma_{a_2}) - \lambda |\tau_{\gamma_{a_2}, \phi}| \ge S(\xi) + S(\gamma_{a_2}) - \lambda \rho \ell$,

where, in the equality, we used $\gamma_{a_2} \cap \xi = \emptyset$, which follows from $\gamma_{a_2} \subseteq B_{\ell a_2, \ell} \notin W_{\xi}$.
In the first inequality, the fact that the path $\tau_{\gamma_{a_2}, \phi}$ is white was used. From
$S(\gamma_{a_2}) \ge c \ell^d$ and (2.40), we deduce that $S(\xi^+) > S(\xi)$, provided that $\ell$ was chosen
so that $\ell > (\lambda \rho / c)^{1/(d-1)}$. We have, however, shown that $\xi^+$ is a lattice animal
in $B_n$, so this contradicts the fact that $\xi$ is a greedy lattice animal in $B_n$. We have
proved (2.39). That $E_1$ may occur for only finitely many values of $n$ almost surely
follows from (2.19).

It remains to rule out the implausible scenario that a greedy animal in $B_n$ might
fail to meet any $\ell$-box $B_{\ell a, \ell}$ with $a \in P_{F, \lfloor \rho \rfloor + 1}$. To prove that $E_2$ may occur for
only finitely many values of $n$ almost surely, we first show, for any $\epsilon > 0$, for $\chi$
satisfying

(2.41)    $\chi > (1 + \epsilon) d / (d - 1)$,

and for all $n$ sufficiently high, that

(2.42)    $\{ L_n \ge (\log n)^{\chi} \} \cap \{ x \in B_n \Rightarrow |B_{\mathcal{L}}(x)| \le (\log n)^{1+\epsilon} \} \cap E_2 = \emptyset$.


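To derive (2.42), suppose now that the event on its left-hand side occurs. We will
connect a greedy lattice animal $\xi$ in $B_n$ satisfying the condition in (2.38) to a
weighty lattice animal $\Psi$ in $B_n$ formed from animals in boxes corresponding to active
sites. To construct $\Psi$, note that for each pair of adjacent sites $a_1, a_2 \in P_{F, \lfloor \rho \rfloor + 1}$,
we may find a white path $\phi_{a_1, a_2}$ from $\gamma_{a_1}$ to $\gamma_{a_2}$ of length at most $\rho \ell$. This is because
the condition $A^{\rho}_{\lambda,c}$ satisfied by $B_{\ell a_1, \ell}$ ensures that $|\gamma_{a_1}| \ge (\log \ell)^{\rho} + 1$, so
that the path $\phi_{a_1, a_2}$ may be found by putting $\gamma = \gamma_{a_1}$ in the condition $A^{\rho}_{\lambda,c}$ satisfied
by $B_{\ell a_2, \ell}$. We set

(2.43)    $\Psi = \bigcup_{a \in P_{F, \lfloor \rho \rfloor + 1}} \gamma_a \cup \bigcup_{a_1, a_2 \in P_{F, \lfloor \rho \rfloor + 1} : \|a_1 - a_2\| = 1} \phi_{a_1, a_2}$.

We now check that $\Psi \in \mathcal{A}_{B_n}$. It is connected, because each $\gamma_a$ is, and to each
adjacent pair $(a_1, a_2)$ of sites in the connected set $P_{F, \lfloor \rho \rfloor + 1}$, there corresponds
a path $\phi_{a_1, a_2}$ that joins $\gamma_{a_1}$ and $\gamma_{a_2}$. Note that for $a_1, a_2 \in P_{F, \lfloor \rho \rfloor + 1}$, we have
that $\ell_\infty(\phi_{a_1, a_2}, B_n^c) \ge \ell_\infty(B_{\ell a_1, \ell}, B_n^c) - |\phi_{a_1, a_2}| \ge (\lfloor \rho \rfloor + 1)\ell - \rho \ell > 0$, so that
$\phi_{a_1, a_2} \subseteq B_n$, and thus, $\Psi \subseteq B_n$.

Note also that

(2.44)    $|\Psi| \ge |P_{F, \lfloor \rho \rfloor + 1}| \ge F^d (1 - \epsilon) \ge \frac{(n - \ell)^d}{\ell^d} (1 - \epsilon)$,

the second inequality following for high choices of $n$ from Lemma 2.7 and (2.32).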




Let $\xi$ be a greedy lattice animal in $B_n$. We are aiming to find a path from $\Psi$ to $\xi$
that is white except perhaps for its endpoints. We may thus assume that $\Psi \cap \xi = \emptyset$,
the other case being trivial. Any set $E$ properly separating $\Psi$ and $\xi$ in $B_n$ satisfies

(2.45)    $|E| \ge \frac{1}{2d} \min\{ |\xi|^{(d-1)/d}, |\Psi|^{(d-1)/d} \}$
    $\ge \frac{1}{2d} \min\left\{ (\log n)^{\chi(d-1)/d}, (1 - \epsilon)^{(d-1)/d} \frac{(n - \ell)^{d-1}}{\ell^{d-1}} \right\} > (\log n)^{1+\epsilon}$

for high values of $n$. In the first inequality here, we made use of the second assertion
in Lemma 2.3 (which requires that $\Psi \cap \xi = \emptyset$), while the second is valid by
the occurrence of the first event on the left-hand side of (2.42) and by (2.44). The
third is due to (2.41). By the occurrence of the second event in (2.42) and (2.45),
no black $\mathcal{L}$-cluster properly separates $\xi$ and $\Psi$ in $B_n$. Applying the first part of
Lemma 2.2 with the choices $C = \xi$, $D = \Psi$ and $E = \{\text{black sites}\} \setminus (\xi \cup \Psi)$, we
deduce that the black sites do not properly separate $\xi$ and $\Psi$ in $B_n$, and thus, we
may locate the desired path $\tau_{x,y}$ in $B_n$ from $x \in \xi$ to $y \in \Psi$ that is white with the
possible exception of its endpoints.

We need a reasonably short white path from $\xi$ to $\Psi$. The path $\tau_{x,y}$ in principle
could be very long. We can make use of it, however, in constructing a short white
path. To do this, consider for now the case where $x$ and $y$ are white. Let $t_{x,y} =
(x = x_0, x_1, \ldots, x_r = y)$ denote some path from $x$ to $y$ in $B_n$ for which $r \le dn$ and
whose sites may or may not be white. We now modify $t_{x,y}$ to form a white path $\sigma$
from $x$ to $y$ in $B_n$ such that

(2.46)    $|\sigma| \le dn (3^d - 1)(\log n)^{1+\epsilon}$.

If $t_{x,y}$ is white, we are done. Otherwise, let $r_1 \in \{0, \ldots, r - 1\}$ denote
the smallest value for which $x_{r_1}$ is white and $x_{r_1 + 1}$ is black. Note that
$x_{r_1} \in \partial_{\mathrm{vis}(x), n}(B_{n, \mathcal{L}}(x_{r_1 + 1}))$; indeed, $(x_{r_1}, x_{r_1 - 1}, \ldots, x_0 = x)$ is disjoint from
$B_{n, \mathcal{L}}(x_{r_1 + 1})$. We claim that there exists $r_2 \in \{r_1 + 2, \ldots, r\}$ for which

(2.47)    $x_{r_2} \in \partial_{\mathrm{vis}(x), n}(B_{n, \mathcal{L}}(x_{r_1 + 1}))$.

For, taking $r_2 = 1 + \sup\{ r' \in \{r_1 + 1, \ldots, r - 1\} : x_{r'} \in B_{n, \mathcal{L}}(x_{r_1 + 1}) \}$, the path
$(x_{r_2}, \ldots, x_r)$ is disjoint from $B_{n, \mathcal{L}}(x_{r_1 + 1})$ and may be prefixed to the reversal
of $\tau_{x,y}$ [which is white and hence disjoint from $B_{n, \mathcal{L}}(x_{r_1 + 1})$]. The result is a path
from $x_{r_2}$ to $x$ in $B_n$ that does not intersect $B_{n, \mathcal{L}}(x_{r_1 + 1})$. Thus, (2.47).

By the second assertion of Lemma 2.1, we may find a path $(x_{r_1} = y_0, y_1, \ldots,
y_{r_3} = x_{r_2}) \subseteq \partial_{\mathrm{vis}(x), n}(B_{n, \mathcal{L}}(x_{r_1 + 1}))$. The same path-altering procedure may be applied
to

$(x = x_0, x_1, \ldots, x_{r_1} = y_0, y_1, \ldots, y_{r_3} = x_{r_2}, x_{r_2 + 1}, \ldots, x_r = y)$,

and then iterated. The effect of each iteration is to replace the passage of the
original path through a black $\mathcal{L}$-cluster by one in its visible boundary with respect
to $x$. After at most $r \le dn$ applications, the iteration produces a white path
from $x$ to $y$ in $B_n$. The length of the path increases by at most $\max\{ |\partial_{\mathrm{vis}(x), n}(B)| -
1 : B \text{ a black } \mathcal{L}\text{-cluster in } B_n \}$ at each step. Note that $|\partial_{\mathrm{vis}(x), n}(B)| \le (3^d - 1)|B| \le
(3^d - 1)(\log n)^{1+\epsilon}$, the latter inequality by the occurrence of the second event on
the left-hand side of (2.42). Thus, setting $\sigma$ to be equal to the white path that the iteration
produces, we have obtained (2.46). In the case where at least one of $x$ and $y$
is black, we may recolor them white for the course of the argument to produce a
path $\sigma$ which is white except for its endpoints $x$ and $y$.

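We form the lattice animal $\Phi = \Psi \cup \sigma \cup \xi$. We showed after (2.43) that $\Psi \subseteq B_n$,
from which $\Phi \subseteq B_n$ immediately follows. We compute

(2.48)    $S(\Phi) = S(\xi) + \sum_{a \in P_{F, \lfloor \rho \rfloor + 1}} S(\gamma_a) + S\left( \left( \sigma \cup \bigcup_{a_1, a_2 \in P_{F, \lfloor \rho \rfloor + 1} : \|a_1 - a_2\| = 1} \phi_{a_1, a_2} \right) \setminus \left( \bigcup_{a \in P_{F, \lfloor \rho \rfloor + 1}} \gamma_a \cup \xi \right) \right)$
    $\ge S(\xi) + |P_{F, \lfloor \rho \rfloor + 1}| c \ell^d - \lambda \left( |\sigma| + \rho \ell \, |\{ (a_1, a_2) : a_1, a_2 \in P_{F, \lfloor \rho \rfloor + 1}, \|a_1 - a_2\| = 1 \}| \right)$
    $\ge S(\xi) + (n - \ell)^d (1 - \epsilon) c - \lambda \left( dn(3^d - 1)(\log n)^{1+\epsilon} + d |P_{F, \lfloor \rho \rfloor + 1}| \rho \ell \right)$
    $\ge S(\xi) + (n - \ell)^d (1 - \epsilon) c - d \lambda \rho \ell^{1-d} n^d - \lambda d n (3^d - 1)(\log n)^{1+\epsilon}$,

where, in the equality, we used the fact that $\gamma_a \cap \xi = \emptyset$ for each $a \in P_{F, \lfloor \rho \rfloor + 1}$,
which follows from $W_{\xi} \cap \{ B_{\ell a, \ell} : a \in P_{F, \lfloor \rho \rfloor + 1} \} = \emptyset$. Regarding the first inequality,
note that, if either of the endpoints $x$ and $y$ of $\sigma$ is black, then that endpoint lies in
$\bigcup_{a \in P_{F, \lfloor \rho \rfloor + 1}} \gamma_a \cup \xi$. In the second inequality, we made use of (2.44) and (2.46). The
third follows from the fact that $|P_{F, \lfloor \rho \rfloor + 1}| \le F^d$. Provided that $\ell$ was chosen so
that $\ell > (d \lambda \rho c^{-1} (1 - \epsilon)^{-1})^{1/(d-1)}$, the inequality (2.48) is admissible for at most
finitely many values of $n \in \mathbb{N}$, since $S(\Phi) > S(\xi)$ would contradict $\xi$ being a
greedy lattice animal in $B_n$. Thus, (2.42).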

We now check that the two events other than $E_2$ that appear on the left-hand
side of (2.42) may fail to occur for only finitely many values of $n$. That $L_n \ge (\log n)^{\chi}$ for
all high $n$ is implied by (2.19). Note also that

$P(\exists x \in B_n : |B_{\mathcal{L}}(x)| > (\log n)^{1+\epsilon}) \le n^d P(|B_{\mathcal{L}}(0)| > (\log n)^{1+\epsilon}) \le n^d \exp\{ -c (\log n)^{1+\epsilon} \}$,

with the constant $c = c(\lambda) > 0$ by Theorem 2.4.2 of [7] and (2.21). Since $\epsilon > 0$,
we see from the Borel-Cantelli lemma that, for all high $n$, every black $\mathcal{L}$-cluster
intersecting $B_n$ has at most $(\log n)^{1+\epsilon}$ sites.




We conclude that the event $E_2$ may occur for only finitely many values of $n$
almost surely. Note that $E_1^c \cap E_2^c$ occurs if and only if for each greedy lattice animal
$\xi \subseteq B_n$, $W_{\xi} \supseteq \{ B_{\ell a, \ell} : a \in P_{F, \lfloor \rho \rfloor + 1} \}$. This completes the proof of Theorem 1.1. □


3. Existence of G; proof of Theorem 1.2. In this section we strengthen a 
result of [3], with hypotheses as general as those of that paper. 


PROOF OF THEOREM 1.2. As we discussed after the statement of the theorem
in the Introduction, we need only consider a law $F$ for which $N > 0$. In
this case, we set $\gamma = \liminf_n n^{-d} G_n$ and $\gamma + E = \limsup_n n^{-d} G_n$. Note that
$\gamma, E \in \bigcap_{n \ge 1} \sigma\{Y_m : m \ge n\}$, where $Y_n$ is a vector whose components comprise
$\{X_v : v \in B_n \setminus B_{n-1}\}$ in an arbitrary order. The family of random variables
$\{Y_n : n \in \mathbb{N}\}$ being independent, Kolmogorov's zero-one law (Theorem 1.8.1 of [4])
implies that $\gamma$ and $E$ are almost sure constants. It thus suffices for proving the
theorem to show that $E > \epsilon > 0$ results in a contradiction. To this end, we aim to
produce a box percolation whose members contain lattice animals whose weight
per unit volume is close to the value $\gamma$ and which may be connected by paths
of negligible weight. If the sidelength of the boxes is chosen to be large, then
the percolation has a high density. As such, the animals lying in members of the
largest connected component of the box percolation inside any very large box may
be joined to form a well-spread lattice animal of weight per unit volume close
to $\gamma$. The assumption that $E > \epsilon$ should allow us to identify lattice animals whose
weight per unit volume exceeds $\gamma + \epsilon$, which may replace parts of the constructed
animal, thereby increasing its weight per unit volume. These heavier animals must
be found in a uniform fraction of space and be capable of being joined to nearby
structure at negligible cost if a sufficient increase in weight is to result from the
proposed modification.

The constant N being strictly positive, Theorem 3.2 of [3] implies that y > 0. 
For this reason, we may apply Lemma 2.4, and shall set A « oo, 2 « p « oo, 
e < (y/2)5 and c € (y — e/594, y) such that lim, , oo Qm,À,c,p = 1. By the argu- 
ment presented after (2.31) in the proof of Theorem 1.1, we may find £ € N so that 
there exists a percolation P whose parameter p satisfies 


(3.1) o(p) > 1—6/QM y) 


and for which $P \subseteq \{a \in \mathbb{Z}^d : a \text{ is active}\}$, where, now that $\ell$ is fixed, we write active
in place of $\ell$-active for the rest of the proof. (We also require that $\ell$ be chosen high
relative to $\lambda$, $\rho$ and also to $\epsilon$. We state the precise bounds as each one arises.)
We will use $n \in \mathbb{N}$ to denote the large scale in the proof, that of the patchwork of
joined animals, and, similarly to the proof of Theorem 1.1, will write $n = F\ell + r$
with $F \in \mathbb{N}$ and $r \in \{0,\dots,\ell-1\}$. We will denote the process $\{a \in \mathbb{Z}^d : a \text{ is active}\}$
by $\hat{P}$ throughout the proof. The reason for this choice of notation is that we will
repeatedly make use of properties satisfied by the process $\hat{P}$ that are inherited from
its subset $P$.


GREEDY LATTICE ANIMALS 617 


By Lemma 2.7 and (3.1), we find that, for any given $C \in \mathbb{N}$, and all $n$ (and
thus $F$) sufficiently high,

(3.2)    $|P_{F,C}| \ge \Bigl(1 - \frac{\epsilon}{2^{8d}\gamma}\Bigr) F^d.$


Similarly to the definition of $P_{F,C}$, let $\hat{P}_{F,C}$ be the largest connected component
of $\hat{P} \cap \{C,\dots,F-1-C\}^d$. From the inclusion $P \subseteq \hat{P}$, we learn that $P_{F,C}$ lies
in some connected component of $\hat{P} \cap \{C,\dots,F-1-C\}^d$. Hence, $P_{F,C} \subseteq \hat{P}_{F,C}$,
provided that $|P_{F,C}| > \frac{1}{2}F^d$. This condition is ensured by (3.2) and the assumption
that $\epsilon < 2^{8d-1}\gamma$. Hence, for each $C \in \mathbb{N}$ and for all high enough values of $n$, we
have that

(3.3)    $|\hat{P}_{F,C}| \ge \Bigl(1 - \frac{\epsilon}{2^{8d}\gamma}\Bigr) F^d.$

Each member of the process {Bea ¢:a € P of £-boxes satisfies condition AÑ B 
and thus contains a lattice animal of weight exceeding (y — £/594)47 that may be 
connected to another such in an adjacent box. We will obtain a backdrop lattice an- 
imal by joining together such animals that lie in £-boxes Bea e for sites a belonging 
to a large connected component of P N Br. We now make precise the notion of 
the heavier animal instances of which we seek to stitch into this patchwork. The 
following definition is convenient. 


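DEFINITION 3.1. For any $m \in \mathbb{N}$, $m > \ell$ and an $m$-box $\Gamma = B_{x,m}$, we set, for
$q \in \mathbb{N}$,

    $\Gamma(\ell, q) = \bigcup \{B_{\ell v,\ell}[q] : v \text{ such that } B_{\ell v,\ell} \cap \Gamma \ne \emptyset\}$

equal to the union of those $\ell$-boxes whose sup-norm distance from some $\ell$-box
intersecting $\Gamma$ is at most $q$. We also write

    $w_\Gamma = \{a \in \mathbb{Z}^d : B_{\ell a,\ell} \cap \Gamma \ne \emptyset\}$ and $D_\infty(w_\Gamma, q) = \{v \in \mathbb{Z}^d : \ell_\infty(v, w_\Gamma) \le q\}$,

so that $\Gamma(\ell, q) = \bigcup\{B_{\ell v,\ell} : v \in D_\infty(w_\Gamma, q)\}$.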
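
The sets in Definition 3.1 can be enumerated directly for small parameters. The following is a minimal sketch, not part of the paper, under the assumption (not restated in this section) that the box $B_{x,m}$ is $x + \{0,\dots,m-1\}^d$; the function names `w_gamma`, `d_infinity` and `gamma_ell_q` are hypothetical.

```python
from itertools import product


def w_gamma(x, m, ell):
    """Indices a such that the ell-box B_{ell*a, ell} meets Gamma = B_{x,m}.

    Assumes B_{x,m} = x + {0, ..., m-1}^d (an assumption of this sketch).
    """
    ranges = [range(xi // ell, (xi + m - 1) // ell + 1) for xi in x]
    return set(product(*ranges))


def d_infinity(w, q):
    """All lattice sites within sup-norm distance q of the finite set w."""
    d = len(next(iter(w)))
    shifts = list(product(range(-q, q + 1), repeat=d))
    return {tuple(ai + si for ai, si in zip(a, s)) for a in w for s in shifts}


def gamma_ell_q(x, m, ell, q):
    """Sites of Gamma(ell, q): the union of ell-boxes indexed by D_infinity(w_Gamma, q)."""
    sites = set()
    for v in d_infinity(w_gamma(x, m, ell), q):
        ranges = [range(ell * vi, ell * (vi + 1)) for vi in v]
        sites.update(product(*ranges))
    return sites
```

For instance, with $d = 2$, $x = (3,5)$, $m = 4$ and $\ell = 2$, the set $w_\Gamma$ is a $3 \times 3$ block of indices, and enlarging by $q = 1$ gives a $5 \times 5$ block.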
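DEFINITION 3.2. For $m > \ell$, an $m$-box $\Gamma = B_{x,m}$ is said to be $(c_1, \lambda, \rho)$-high
provided that:


• there exists $\gamma^* \in A_\Gamma$ such that $S(\gamma^*) = G_\Gamma > (\gamma + c_1)m^d$ and $D(\gamma', \gamma^*) \le \rho m$
for all $\gamma' \in A_{\Gamma[1]}$ such that $|\gamma'| \ge (\log m)^2$,

• any two sites $v_1, v_2 \in D_\infty(w_\Gamma, \lfloor\log(m/\ell)\rfloor) \setminus D_\infty(w_\Gamma, \lfloor\log(m/\ell)\rfloor - 1)$ that are
connected by a path in $\hat{P} \cap D_\infty(w_\Gamma, \lfloor\log(m/\ell)\rfloor)$, are connected by a path in
$\hat{P} \cap (D_\infty(w_\Gamma, \lfloor\log(m/\ell)\rfloor) \setminus w_\Gamma)$.


We write "high" for $(\epsilon, \lambda, \rho)$-high.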




We now construct a disjoint collection of high boxes that fill out a uniform 
fraction of a large box in Z. 


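LEMMA 3.1. For any $m_1 > \ell$, there exists $m_2 > m_1$ and a collection $\kappa = \kappa_{m_1,m_2}$
of high boxes in $\mathbb{Z}^d$ such that if $\Gamma = B_{x,m} \in \kappa_{m_1,m_2}$, then $m \in \{m_1,\dots,m_2\}$, if
$\Gamma_1, \Gamma_2 \in \kappa_{m_1,m_2}$, then $\Gamma_1[1] \cap \Gamma_2[1] = \emptyset$, while

(3.4)    $\liminf_{m\to\infty} m^{-d} \Bigl|\Bigl(\bigcup_{\Gamma \in \kappa} \Gamma\Bigr) \cap B_m\Bigr| \ge \frac{1}{2 \cdot 7^d}$

and

(3.5)    $\limsup_{m\to\infty} m^{-d} \Bigl|\Bigl(\bigcup_{\Gamma \in \kappa} \Gamma\Bigr) \cap B_m\Bigr| \le 3^{-d}.$
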
PROOF. We claim first that 

(3.6)    $\mathbb{P}(B_m \text{ is high for infinitely many } m \in \mathbb{N}) = 1.$


To derive (3.6), note that the box Bm satisfies the first requirement in the 
definition of high provided that it satisfies condition Aj ,,,. The fact that 


limsup, m 4G, = y + E > y + implies that there are almost surely infinitely 
many values of m € N for which there exists E € Ag, such that S(£) = Gm > 
(y 4- :)m?. The proof of Lemma 2.4 may be applied to show that condition AY it 
holds for all but finitely many of those m for which such & exists. 

To handle the second requirement, suppose that for a pair of sites $v_1, v_2 \in
D_\infty(w_{B_m}, \lfloor\log(m/\ell)\rfloor) \setminus D_\infty(w_{B_m}, \lfloor\log(m/\ell)\rfloor - 1)$, we may find a path
$(v_1 = x_1, x_2, \dots, x_r = v_2)$ with

    $x_i \in \hat{P} \cap D_\infty(w_{B_m}, \lfloor\log(m/\ell)\rfloor)$

for $i \in \{1,\dots,r\}$. If $v_1$ and $v_2$ are not connected by a path in
$\hat{P} \cap (D_\infty(w_{B_m}, \lfloor\log(m/\ell)\rfloor) \setminus w_{B_m})$, then the path $(x_1,\dots,x_r)$ visits $w_{B_m}$ and we can set
$r_1 = \inf\{i \in \{1,\dots,r\} : x_i \in w_{B_m}\}$ and $r_2 = \sup\{i \in \{1,\dots,r\} : x_i \in w_{B_m}\}$. The fact that
$v_1$ and $v_2$ are not connected by such a path implies that the sets $C = \{x_1,\dots,x_{r_1-1}\}$
and $D = \{x_{r_2+1},\dots,x_r\}$ are separated by $\hat{P}^c$ in $D_\infty(w_{B_m}, \lfloor\log(m/\ell)\rfloor) \setminus w_{B_m}$.
Put differently, the sets $C$ and $D$ are separated by $E = \hat{P}^c \cup w_{B_m}$ in the
box $D_\infty(w_{B_m}, \lfloor\log(m/\ell)\rfloor)$. Noting the pairwise disjointness of $C$, $D$ and $E$, we
may apply the first part of Lemma 2.2, to deduce that there is an $\ell_\infty$-connected
set $\chi \subseteq \hat{P}^c \cup w_{B_m}$ that separates $\{x_1,\dots,x_{r_1-1}\}$ and $\{x_{r_2+1},\dots,x_r\}$ in
$D_\infty(w_{B_m}, \lfloor\log(m/\ell)\rfloor)$. Note that

(3.7)    $\chi \cap (D_\infty(w_{B_m}, i) \setminus D_\infty(w_{B_m}, i-1)) \ne \emptyset$

for each $i \in \{1,\dots,\lfloor\log(m/\ell)\rfloor\}$. To see this, note that $\ell_\infty(x_1, w_{B_m}) = \lfloor\log(m/\ell)\rfloor$
and $x_{r_1} \in w_{B_m}$ imply that for any $i \in \{1,\dots,\lfloor\log(m/\ell)\rfloor\}$, there exists
$j_1 \in \{1,\dots,r_1-1\}$ for which $\ell_\infty(x_{j_1}, w_{B_m}) = i$. Similarly, there exists $j_2 \in \{r_2+1,\dots,r\}$
for which $\ell_\infty(x_{j_2}, w_{B_m}) = i$. Let $(x_{j_1} = y_1, \dots, y_t = x_{j_2})$ denote a path
from $x_{j_1}$ to $x_{j_2}$ in the connected set $D_\infty(w_{B_m}, i) \setminus D_\infty(w_{B_m}, i-1)$. This path must
intersect $\chi$ by the separation property that $\chi$ satisfies. We have proved (3.7).

We have seen that if the second requirement for the box $B_m$ to be high fails, then
there exists $x \in D_\infty(w_{B_m}, \lfloor\log(m/\ell)\rfloor)$ such that $|C_{\hat{P}^c}(x)| \ge \lfloor\log(m/\ell)\rfloor$, where
$C_{\hat{P}^c}(x)$ denotes the $\ell_\infty$-cluster of $\hat{P}^c$ that contains $x$. Using the fact that $P \subseteq \hat{P}$
and applying a union bound, we see that, for high values of $m$, this latter event has
probability at most

(3.8)    $|D_\infty(w_{B_m}, \lfloor\log(m/\ell)\rfloor)| \, \mathbb{P}(|C_{P^c}(0)| \ge \lfloor\log(m/\ell)\rfloor)$,

where $C_{P^c}(x)$ is of course defined analogously to $C_{\hat{P}^c}(x)$. Note that

    $\mathbb{P}(|C_{P^c}(0)| \ge \lfloor\log(m/\ell)\rfloor) \le \sum_{n = \lfloor\log(m/\ell)\rfloor}^{\infty} \sigma_n (1-p)^n$,

where $\sigma_n$ denotes the number of $\ell_\infty$-clusters containing 0 and having $n$ sites. Given
that the sequence $\{\sigma_n : n \in \mathbb{N}\}$ grows exponentially (see [8]), we find that

    $\mathbb{P}(|C_{P^c}(0)| \ge \lfloor\log(m/\ell)\rfloor) \le \exp\{-c \log(m/\ell)\}$,

where $c = c(p)$ is a positive constant satisfying $c(p) \to \infty$ as $p \uparrow 1$. For this
reason, we may suppose that $p$ has been chosen so close to 1 that the sequence
in $m$ of terms (3.8) is summable. [This lower bound on $p$ is ensured by fixing $\ell$ at
a high enough value, in the same way that (3.1) was obtained.] We deduce from the
Borel–Cantelli lemma that the second condition in the definition of "high" applies
to all but finitely many of the boxes $B_m$ almost surely. We have shown (3.6).
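Given $m_1 \in \mathbb{N}$, there exists by (3.6) $m_2 > m_1$ for which

(3.9)    $\mathbb{P}(B_m \text{ is high for some } m \in \{m_1,\dots,m_2\}) > 1/2.$

Let $A = \{x \in \mathbb{Z}^d : B_{x,m} \text{ is high for some } m \in \{m_1,\dots,m_2\}\}$. We claim that

(3.10)    $\{x \in A\} \in \sigma\{X_v : \ell_\infty(v, B_{x,m_2}) \le \rho m_2\}$,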


provided that mı (and thus m2) is high enough relative to £. To verify this claim, 
note that, by (2.29), the first condition that defines a high box is determined by 
the values of the random variables Xy, appearing in (3.10). The second is mea- 
surable with respect to the c-algebra o(a € P N Doo(@By ,,, 10g(m;/0)))- Recall- 


ing the property (2.29) of measurability satisfied by the event a € P, we find 
that this second condition is measurable with respect to o (Xy :£oo(v, By m,) < 
£(log(m2/£) + 1 + p). A sufficiently high choice of mı relative to mz therefore 
ensures that (3.10) does indeed hold. 

Let (z; : j € N} denote an ordering of N that enumerates each shell B; \ B; 
in turn. Setting m = |om2| + 1 and using the fact that By, N By a =Ø if 




Ix — x'|| => m, it follows from (3.10) that the sequence of events {a + mz; € A: 
j € N} is independent, for any given a € B. This sequence is identically distrib- 
uted because the law of (Xy:v € NÎ} is translation invariant. As such, the strong 
law of large numbers (Theorem 1.7.1 of [4]) may be applied to the sequence of 
random variables 1[a + mz; € A: j € NJ. By considering the sequence of partial 
sums corresponding to those j at which the enumeration of the set B; is completed 
by z;, we deduce from (3.9) that 


|Bm NAN {a+ ANÉ} 
|Bm N {a + AN? | 


From (3.11) and the fact that |Bm N (a + ANE} = mr 4 + m'-4O0(m7"), it 
follows that 


limm~*|Bm N Al 


(3.11) the limit lim exists and exceeds 1/2. 


(3.12) —limm ^ ^ |Bm N A (a + AN’) 
ac Bp 
Bm 1A n(a-mN4^) 1 


- limp + 4 0(m-75)] Y: 8 DOR. 
Dp [Bm N (a + mN} 2 


For $x \in A$, let $\Gamma_x = B_{x,m}$, with $m \in \{m_1,\dots,m_2\}$ maximal such that $B_{x,m}$ is high.
Let $A' = \{\Gamma_x : x \in A\}$. It remains to disjointify the collection of boxes $A'$ while
retaining enough of its members so that their union has positive density. To do
this, enumerate

    $A' = \{B_{x_{m,j},m} : m \in \{m_1,\dots,m_2\},\ j \in \mathbb{N}\}$,

so that $\{x_{m,j} : j \in \mathbb{N}\}$ is an ordering of those $x \in A$ for which $\Gamma_x$ has sidelength $m$.
We will iteratively examine the indices $(m,j)$ that label members of $A'$, admitting
one at each step into a set of accepted indices $\mathcal{A}$ while at the same time placing
others in a set of rejected indices $\mathcal{R}$. We will allow these symbols to denote those
indices currently accepted or rejected at each step, without using further labeling.
At the start, $\mathcal{A} = \mathcal{R} = \emptyset$. We begin by examining the indices $\{(m_2, j) : j \in \mathbb{N}\}$.
At the first step, we put $(m_2, 1)$ in $\mathcal{A}$, and reject (put in $\mathcal{R}$) those $(m,i)$ [except
for $(m_2,1)$] for which $B_{x_{m_2,1},m_2}[1] \cap B_{x_{m,i},m}[1] \ne \emptyset$. At the generic step for boxes
of sidelength $m_2$, we put $(m_2, i)$ in $\mathcal{A}$, where $i \in \mathbb{N}$ is minimal for which $(m_2, i)$ is
not currently in $\mathcal{A} \cup \mathcal{R}$, and put in $\mathcal{R}$,

    $\{(m,j) : m \in \{m_1,\dots,m_2\},\ j \in \mathbb{N},\ (m,j) \ne (m_2,i),\ B_{x_{m,j},m}[1] \cap B_{x_{m_2,i},m_2}[1] \ne \emptyset\}$.

(Note that some of these indices may have entered $\mathcal{R}$ at an earlier step.) After
at most countably many iterations, $(m_2, i) \in \mathcal{A} \cup \mathcal{R}$ for each $i \in \mathbb{N}$. We proceed
to deal with those $\{(m,i) : i \in \mathbb{N}\}$ not yet in $\mathcal{A} \cup \mathcal{R}$, for $m = m_2 - 1$, then for
each $m$ in descending order until we finish with $m = m_1$. At the generic step when
$m \in \{m_1,\dots,m_2\}$ is some fixed value, $(m,i)$ is admitted to $\mathcal{A}$ for the least $i$ for
which it is not already in $\mathcal{A} \cup \mathcal{R}$, while all those other $(m',j)$ for which $B_{x_{m',j},m'}[1]$
intersects $B_{x_{m,i},m}[1]$ enter $\mathcal{R}$. At the end of the procedure, each $(m,i)$ lies in
$\mathcal{A} \cup \mathcal{R}$. We set $\kappa = \{B_{x_{m,i},m} : (m,i) \in \mathcal{A}\}$, with $\mathcal{A}$ now denoting the collection of
accepted indices at the end.
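
For a finite collection of boxes, the accept/reject procedure above reduces to a greedy pass in order of decreasing sidelength. The following is a minimal sketch of that reduction, not the paper's construction itself (which handles countably many boxes). It assumes that $B_{x,m} = x + \{0,\dots,m-1\}^d$ and that $B_{x,m}[k]$ is the concentric box of sidelength $(2k+1)m$, in line with the identities $|\gamma[3]| = 7^d|\gamma|$ and $|\Gamma[1]| = 3^d|\Gamma|$ used in this proof; the function names are hypothetical.

```python
def enlarge(box, k):
    """B_{x,m}[k]: the box extended by k sidelengths in every direction (assumed convention)."""
    x, m = box
    return (tuple(xi - k * m for xi in x), (2 * k + 1) * m)


def intersects(b1, b2):
    """True if two axis-parallel boxes (corner, sidelength) share a lattice site."""
    (x1, m1), (x2, m2) = b1, b2
    return all(a < b + m2 and b < a + m1 for a, b in zip(x1, x2))


def contains(outer, inner):
    """True if the box `inner` lies inside the box `outer`."""
    (xo, mo), (xi, mi) = outer, inner
    return all(a <= b and b + mi <= a + mo for a, b in zip(xo, xi))


def disjointify(boxes):
    """Greedy selection: largest boxes first, rejecting any box whose
    [1]-enlargement meets the [1]-enlargement of an accepted box."""
    accepted = []
    for box in sorted(boxes, key=lambda b: (-b[1], b[0])):
        if all(not intersects(enlarge(box, 1), enlarge(a, 1)) for a in accepted):
            accepted.append(box)
    return accepted
```

Because a rejected box is never larger than the accepted box that excluded it, each rejected box lies inside the [3]-enlargement of some accepted box; this is the finite analogue of the covering property the proof goes on to use.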

The first two properties asserted for the collection $\kappa$ follow directly by its
construction. We claim that

(3.13)    $\bigcup_{\Gamma \in A'} \Gamma \subseteq \bigcup_{\Gamma \in \kappa} \Gamma[3].$

To show (3.13), note that each index $(m,i)$ of some box in $A'$ is eventually either
accepted and so lies in $\kappa$ [so that the box certainly lies in the set on the right-hand
side of (3.13)], or is rejected by the algorithm. If it is rejected, consider the index
$(m',j)$ whose admission to $\mathcal{A}$ resulted in $(m,i)$ joining $\mathcal{R}$. The key point is that this
can only happen if $m' \ge m$, because all boxes in $A'$ whose sidelength exceeds $m'$
have been dealt with by the time $(m',j)$ is admitted to $\mathcal{A}$. This fact along with the
criterion for the rejection of $(m,i)$, namely $B_{x_{m',j},m'}[1] \cap B_{x_{m,i},m}[1] \ne \emptyset$, imply
that $B_{x_{m',j},m'}[3] \supseteq B_{x_{m,i},m}$. Thus, (3.13). We may now estimate


(U r) os: 


rek 


> UT ec rc Bj) 


= >. fer 3, Pp 


lec, CB; lec, CB; 


> 774 Jtr] T ek, C By) 








> 77 (LJtrt:r ei) n {x € Be: Loo(x, BE) > 4mà]| 
> 7 (LJtr:r e A) A fx € Bets BE) > 4m2)| 
> TIAN {x € Bg: foo (x, BE) > 4m }| 

> 741A N Bel — 8dm 772 kt! 


> (1/2 -0())777 d — 02), 


which implies (3.4). In the first equality, we used the fact that $\kappa$ is a disjoint
collection of sets. In the second equality, we used the fact that $|\gamma[3]| = 7^d|\gamma|$, which
is valid for any box $\gamma$. In the second inequality, we used that if for any $B_{x,m} \in \kappa$,
there exists $x' \in B_k \cap B_{x,m}$ such that $\ell_\infty(x', B_k^c) > 4m_2$, then $B_{x,m}[3] \subseteq B_k$, this
following directly from $m \le m_2$. In the third inequality, (3.13) was used and, in the
fourth, that $x \in A \Rightarrow x \in \Gamma_x$. In the final inequality, (3.12) was used. The
property (3.5) is derived by a similar estimate, that makes use of the disjointness of the
collection $\{\Gamma[1] : \Gamma \in \kappa\}$ and the fact that $|\Gamma[1]| = 3^d|\Gamma|$ for any box $\Gamma$. □




In the application, we insist that the value of m; € N satisfy the inequality 


p^ E 
We will mention conditions stipulating that $m_1$ must be high relative to $\ell$, $\rho$, $\lambda$
and $c$ as they arise. We will be joining animals lying in the boxes of $\kappa$ to a structure
of joined lattice animals that lies in $\ell$-boxes. To do so, our first step is to make space
in the fabric of joined lattice animals for the high boxes lying in $\kappa$. We now claim
that, for $n$ sufficiently high almost surely,


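(3.15)    $\hat{P}_{F,C_1} \setminus \bigcup_{B_{x,m} \in \kappa} D_\infty(w_{B_{x,m}}, \lfloor\log(m/\ell)\rfloor)$

lies in a connected component of $\hat{P}_{F,\lfloor\rho\rfloor+1} \setminus \bigcup_{\Gamma \in \kappa} w_\Gamma$, where
$C_1 = \lfloor m_2/\ell + 2\log(m_2/\ell)\rfloor + \lfloor\rho\rfloor + 2$. To show this, consider $a_1, a_2$ that lie in the set on the
left-hand side of (3.15). Let $\tau = (a_1 = y_1, \dots, y_r = a_2) \subseteq \hat{P} \cap \{C_1,\dots,F-1-C_1\}^d$
be a path from $a_1$ to $a_2$. We aim to modify the path $\tau$ to find a new one from
$a_1$ to $a_2$ that avoids each of the sets $w_\Gamma$ for $\Gamma \in \kappa$ while staying in
$\hat{P} \cap \{\lfloor\rho\rfloor+1,\dots,F-2-\lfloor\rho\rfloor\}^d$. We may assume then that $\tau$ does not itself satisfy these
requirements, so that $\tau \cap w_{B_{x,m}} \ne \emptyset$ for some $B_{x,m} \in \kappa$. Set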


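    $r_1 = \inf\{i \in \{1,\dots,r\} : y_i \in D_\infty(w_{B_{x,m}}, \lfloor\log(m/\ell)\rfloor)\}$

and

    $r_2 = \inf\{i \in \{r_1+1,\dots,r\} : y_i \notin D_\infty(w_{B_{x,m}}, \lfloor\log(m/\ell)\rfloor)\} - 1.$

Note that $r_1 > 1$, because $a_1 \notin D_\infty(w_{B_{x,m}}, \lfloor\log(m/\ell)\rfloor)$, and that
$\ell_\infty(y_{r_i}, w_{B_{x,m}}) = \lfloor\log(m/\ell)\rfloor$ for $i \in \{1,2\}$. The subpath $(y_{r_1},\dots,y_{r_2})$ is an excursion of $\tau$ inside
the set $D_\infty(w_{B_{x,m}}, \lfloor\log(m/\ell)\rfloor)$. The $m$-box $B_{x,m} \in \kappa$ being high, we may, by the
second requirement in the definition of "high," find a path $(y_{r_1} = z_1, \dots, z_{r_3} = y_{r_2})$
such that

(3.16)    $z_i \in \hat{P} \cap (D_\infty(w_{B_{x,m}}, \lfloor\log(m/\ell)\rfloor) \setminus w_{B_{x,m}})$

for $i \in \{1,\dots,r_3\}$. Note that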
Coo (zi, BE) = inf{Lo0(z, Bp) :z such that £oo(z, wp, m) = Llog(m/£) |] 
(3.17) > C56 (97s BS) — (Im/£] +1+2[log(m/2£)]) 
> Ci — (Im/£] + 1+ llog(m/£)]) > Lo] +1, 


the third inequality valid by y,, € (C1, ..., F — 1 — Ci} and the fourth due to 
m < m2. For any By m € k for which (x', m^) Æ (x, m), we have that 


(3.18) — zi € Doo(WB, m> log(m/£)]) € Doo(wn, w» l1og(m'/£)])', 




where the containment follows from By m[1]M By',,;[1] = Ø and a choice for m; 
that satisfies mı > C£. By (3.16), (3.17) and (3.18), we see that the path 


(Ay 1755 Yn 9 Elsosco Zoe, Yr Se) 
c Pn(Lo| T L..., F — |e] - 2) 


has removed any instance of a visit to wp, ,, during the excursion in Doo(WBy m> 
|log(m/£) |) from y,, to y,, without introducing any new visits to 


LJ Doo(way 4s Ulog(m' /0]). 


By! m! EK 


We modify the path in such a way, for each example of an excursion into the set 
Doo (we — |log(m'/£)]) for any By mw € x. After a finite number of such al- 
terations, we obtain a path @ from a; to a? in PN {|p| -1,..., F — |o] — 2)? 
that is disjoint from | J(w By mi ` Be'm € xk}; indeed, any excursion of $ in a 
set Doo (wp, m: Llog(m/£)]) will not visit wg, ,,, nor Doo(wgy „„ Llog(m’/£)]) 2 
WB, m by construction. We have proved that the set in (3.15) lies in a connected 
component of P r.|5j.41 V Ure, wr, as we sought to do. We call this connected 
component the backdrop and denote it by BD = BD(n, £). 
We will require the following lower bound on |BD]: 


(Ur)n&. 


rek 


Ca £ d 
619 IBDiz [Prci] - (1+ zgr)! 








To obtain this, note that 
IBD| > [P r.c, | — |(LU{Doo(w an» 086/01) : Bs € ]) A Pri 


> [Pr.c| — Ul Des(wn,,.. log(m/0)]) : Bum €x, Bsm € Bs] 


the first inequality being due to the definition of BD. The second is valid 
because m < m2 implies that Ci > [m/£] + 2 + 2|[log(m/£)]|, from which 
it follows that if Dos (wn, m, [log(m/£) ) n (Cr,..., F - 1 — Ci}? # Ø, then 
Doo (wp, m» log(m/£) |) € Br and thus By,» € Bn (for By,» Z B, => wa, O 
Bs, # ©). Note that 


(3.20) 


, 





(3.21) |W Be m| = m/e]? 
and 


|W Be m| x (1m/£] 4-2) 


(3.22) 3 
E m € | By mi 
<{1+—.— }— ={1+—___ }-— 
<( + sya) gd ( + saa) £d 





given that m, > C(e, y)£. We find that 
IDoo (wn, n log(m/£)])| € (Im/£] 4-2 4- 2[log(m/e)})4 
2(1 + log(m/£)) \4 
= (14 |m/£| ) |W Bym 


€ 
«(1 57) TA 
( 3. 10104 y IB 
2 
€ | Bx, m| 
LH gig oM dae 
<( + 3-390927) gd 


€ |Bx,ml 
<(1 + 10104 y ons) gd ? 
the second inequality by (3.21), the third by m; > C(e, y)£, the fourth due to (3.22) 
and the fifth from e < 3 - 10197 y, Thus, 


3 [Doo(W By m+ Log(m/2)|)| 


By m€k, By S C Bn 


€ -d 
(3.23) < (1 Toons r) > N 


rex, TC B, 
—d 
«(1 E io; is}! 


(U r) 1 Bal: 
rek 
By (3.20), (3.23) and the triangle inequality, follows (3.19). 

We now define the lattice animal $\Psi$ formed from animals in $\ell$-boxes
corresponding to active sites of the backdrop BD and into which we will stitch animals
from high boxes in $\kappa$: let

(3.24)    $\Psi = \bigcup_{a \in \mathrm{BD}} \gamma_a \cup \bigcup_{a_1,a_2 \in \mathrm{BD}:\,|a_1-a_2|=1} \phi_{a_1,a_2}.$








[Recall from after (2.39) that Ya is the animal y* in the condition AL. satis- 
fied by Braz, and, from before (2.43), that each A-white path da, a, satisfies 
iba; an| < p£, and intersects each of ya, and ya,.] 

There may be a few high boxes of $\kappa$ contained in $B_n$ whose greedy lattice animal
(or animals) cannot be connected to $\Psi$ in the intended way, if it so happens that $\Psi$
does not reach into a neighborhood of these high boxes that would allow the greedy
animals therein to attach to $\Psi$. In addition, when a path can be formed from inside
the high box to $\Psi$, we must ensure that the path stays in $B_n$, which amounts to
insisting that the box be at a certain distance from the complement of $B_n$. We now
define the set of high boxes in $\kappa$ whose greedy lattice animal we will connect to $\Psi$,
bearing in mind these two requirements.


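DEFINITION 3.3. Let the set of useful high boxes UH be given by

(3.25)    $\mathrm{UH} = \{B_{x,m} \in \kappa : B_{x,m} \subseteq \{\lfloor\rho m_2\rfloor + 1, \dots, n - \lfloor\rho m_2\rfloor - 2\}^d,\ \mathrm{BD} \cap D_\infty(w_{B_{x,m}}, \lfloor m/(2\ell)\rfloor) \ne \emptyset\}.$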


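We now show that it is only a few boxes in $\kappa$ contained in $B_n$ that do not make
it into UH. Specifically, we prove that, for all sufficiently high $n$,

(3.26)    $\Bigl|\bigcup_{\Gamma \in \mathrm{UH}} \Gamma\Bigr| \ge \Bigl|\Bigl(\bigcup_{\Gamma \in \kappa} \Gamma\Bigr) \cap B_n\Bigr| - \frac{\epsilon n^d}{2^{8d-1}\gamma}.$

To this end, note that

(3.27)    $\Bigl(\bigcup_{\Gamma \in \kappa} \Gamma\Bigr) \cap B_n \subseteq \Bigl(\bigcup_{\Gamma \in \mathrm{UH}} \Gamma\Bigr)
          \cup \bigcup\{\Gamma \in \kappa \setminus \mathrm{UH} : \Gamma \subseteq \{\lfloor\rho m_2\rfloor + 1, \dots, n - \lfloor\rho m_2\rfloor - 2\}^d\}
          \cup \{x \in B_n : \ell_\infty(x, B_n^c) \le \lfloor\rho m_2\rfloor + m_2\}.$

To show that the second set on this right-hand side is small, we adopt a temporary
notation, saying that the box $B_{x,m}$ is "far from BD" if $B_{x,m} \in \kappa \setminus \mathrm{UH}$ and
$B_{x,m} \subseteq \{\lfloor\rho m_2\rfloor + 1, \dots, n - \lfloor\rho m_2\rfloor - 2\}^d$. For such a box,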


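(3.28)    $\mathrm{BD} \cap D_\infty(w_{B_{x,m}}, \lfloor m/(2\ell)\rfloor) = \emptyset.$

Note that, for $B_{x',m'} \in \kappa$, $(x',m') \ne (x,m)$,

(3.29)    $D_\infty(w_{B_{x,m}}, \lfloor m/(2\ell)\rfloor) \subseteq \{a \in \mathbb{Z}^d : B_{\ell a,\ell} \subseteq B_{x,m}[1]\}$
          $\subseteq \{a \in \mathbb{Z}^d : B_{\ell a,\ell} \subseteq B_{x',m'}[1]\}^c$
          $\subseteq D_\infty(w_{B_{x',m'}}, \lfloor m'/(2\ell)\rfloor)^c$,

the first and third containments requiring that $m_1 \ge 4\ell$, since this ensures that any
$m \ge m_1$ satisfies $\lfloor m/(2\ell)\rfloor + 1 \le m/\ell - 1$. The second containment follows from
$B_{x,m}[1] \cap B_{x',m'}[1] = \emptyset$. We have that

(3.30)    $D_\infty(w_{B_{x,m}}, \lfloor m/(2\ell)\rfloor) \subseteq B_F.$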
Indeed, 


en 


pnm; < pm», < [om2] T 1 < Loo(Bx,m, Br) < £o0(WBy m> BM T 2L, 


so that (3.30) follows, given that mı may be chosen so that mı > 2£/p. We now 
show that the collection of sets 


(3.31) [P F.C; Doo(WBy m, Lm /(2£) ]) : Bx,m far from BD} 




is disjoint, with union cóntained | in Br. By (3.28), (3.29), (3.30) and BD C Br, 
we find that (3.31) is true with P rc, replaced by BD. However, if there exists 
y € Pr,c, N Doo wp, m, Un / (20) ]) for some By m € K, then we may locate 


(3.32) z€ Prc, N (Doo(WB, m, Um/ Q4) ]) \ Doo(wn, ,,. (log(m/£)])). 
To find such a z, we may assume that y € Dos (wp, m, |log(m/£) |). Note that 


(3.33) [Prc| > (m/£ 4-1 4- 210g(m/0)) > |Doo(wa,,,, |1og(m/2)])]. 


where the first inequality is valid for high n by (3.3). Since the set Perc 
is connected, by (3.33), there exists a path in this set from y that leaves 
Doo(WB, m» llogGn/£) |). The first vertex of this path after it leaves Doo(W By m> 
llog(m/£) |) may be chosen as z in (3.32). 

From (3.32), (3.29) and 


Doo(WBy m ki |log(m TOJE C Doo(WBy n Ps |m '/Q4)]), 


it follows that z belongs to the set in (3.15), so that z € BD. By the fact that the sets 
in (3.31) are disjoint if P p. c is replaced by BD, we find that the box By m cannot 
be far from BD, that is, the collection of sets in (3.31) 1s indeed disjoint. 

We estimate 


LUIT far from BD) | = S wise? Y) gue 


T far from BD T far from BD 


(3.34) <0 5 [Doo (WBgms (m/Q2))] 
By, m far from BD 


elt Fd enf 
28d y 28d y 
where the second inequality follows from the fact that for any By m € K, 
| Doo(W Be m,tm/(2e)|)| z Qm/£ — 3**, 


allied with (3.22), the inequality £ < 3 10102 y (27 — 2) and the lower bound m > 
mı > C£. The third inequality in (3.34) follows from the claimed property of the 
collection in (3.31), and the fourth from (3.3). We may now bound 


Seve (Ur) 98] - ser, 





< t (F* —|Prc,|) x 


— |(x € By: £o0(x, BE) < Loma] + ma)] 





























leUH 
- (U Jul T — 2d (Lom ] + m3)n*^ 
rek 
> (U r) NB ent 
= ^| 98d—-1,)’ 
rek 2 y 




the first inequality valid by (3.27) and (3.34), the final one valid for all high n. We 
have obtained (3.26). 

We need to connect greedy lattice animals in the boxes of UH to the backdrop 
animal V. For I e UH, consider then y* = Vp € Ar, provided by the first of 
the two conditions that the high box T satisfies. As F € UH, there exists a € BD 
for which £.9(a, wr) x [m/(2£)] if T is of the form By». Any y € ya € Bia; 
satisfies 


(3.35) looly, T) x £(1m/(20)] +1) x m/2 +22 <3m/4, 


given that m > m, > 8£. Note that, by the disjointness of (ya:a € BD}, (3.19), 
(3.24), (3.3) and (3.5), 


T — zd. d — —d 
(3.36) |» IBD] > (1 Jd 2; )F 2.3 (14 ior; TEN ig; Je" > Ee d 


for high values of n, given that € may be chosen so that ¢ < y(2—84 4 
2.374 .107104)-1(9/10 — 2 . 374). Given that m is at most the fixed constant m2, 
we may by (3.36) find a lattice animal x € V with Ix| = Ldogm)?] + 1 and 
ye x. If ze x, then £..(z, r) x |dogm)? | + £y, D) <m, by (3.35) and the 
fact that we may choose mı high enough that for each m > m1, (log m)? < m/4. 
Thus x € l'[1]. By the first condition that the high box T satisfies, we may locate 
a A-white path or from a site of y* to one of x, with lór| < < pm. We can now 
define the lattice animal, modified from V in the way that we sought: 


(3.37) d=WuU |) Wier). 
leUH 


It remains to verify that ® has the required properties. It is indeed a lattice animal, 
for each y is connected to the animal jy by a path dr. We claim that ® C Ba. We 


may show that uU C Bn, in the same way that we showed that V C B, after (2.43). 
Note also that y CT € By, because I € UH. If y € op, m then 


(3.38) Loo y, Bx,m) < Lool¥s VB, m) < 65, < 0m < pma, 
the second inequality due to yg, „N È Bym # ©. However, 


(3.39) Bym UH ==> 6o (Bf, Brym) = Loma] +1. 


From (3.38) and (3.39), we deduce that y € B,. We have shown that ® C B,. Note 
that 


S= Y SQ) + > SOB 


(3.40) acBD l'eUH 


«s( LJ ĝru J tum) \ ( Uru l 4). 


lCeUH 21,22€BD: |a; —a2|-1 acBD leUH 




since, for a € BD, Ya C Bea,e is disjoint from any £-box intersecting any F € x, 
and thus from each xp CT’, by the definition of BD. We bound 


>> 504) = |BD|ct? 
acBD 


e d pd 
oan (i-is) - ia) 


- (1 ue; )» (U r) os, 


rek 
where S(ya) > c£? was used in the first inequality, the second is due to (3.19), 
(3.3) and y > c > y — €/(562). Note also that 


Bey 


l'eUH 


(U r) N Bn 
rex 

the first inequality following from the disjointness of the boxes I € UH and the 
definition of the animals yj, the second due to (3.26). Note that 


(3.43) ) me— SD Ms- 


By,m EUH mi leUH 


3 








X Sp = +e) 


leUH 








(3.42) 


d 

en 
> (y +£) —(y-Fe)zgr——. 
28d ly 








because m > mı for each By ,, € UH, and the collection UH is disjoint, with its 
union contained in B,. We find that 


(us, ae Nau) 


lecUH a; a2 EBD: |a; —a5|—1 
=a lt E Meal 
revH aj,22 €BD: |a; —a5|—1 
(3.44) > EZ X` m+ é|{{aj, a2}:a1, a2 € BD, Jai — 22] = yl 
Bx 5, EUH 


> —Ap(n / m! + dt F^) 


d 
En 
= 35d = dApt F?, 




the final inequality by (3.14). Substituting the bounds (3.41), (3.42) and (3.44) 


into (3.40) yields 
S(®) > [y +e- (Sr. 10184; zw; y (U r N B, 
EK 


e d pd 
Gies) 


€ 
ae D — ——(y + &)n? — 334 zn? — dapt F^ 
d 
AAO TEE A NIA 
1010dy J? |2.74 


+(1 - za; o7 su )e - 9" 


ud e od 
"2 Ly gui, 0 te T 35d" T gio" >? 








(3.45) © 


the second inequality using (3.4) and the inequality 2 > (3!°¢dape—!)'/@—- that 
we may require that £ satisfies. For large values of n € N, the dominant term in the 
last expression is the one in n^, whose coefficient is bounded below by 


1 J 
He cal 
2-74 2.74.1010 
(3.46) 
l I 1 1 I € 
. 98d Bd- 56d 75d 3104  958d-1y]' 


which strictly exceeds y, provided that e < y. 
The lattice animal may be formed for all sufficiently large n. We conclude 
that 


lim inf 5(Pn) > 
A> CO nd ys 





which implies that 
_ . Gn 
DES nd TY 
an inconsistency which completes the proof. Œ 
4. Proof of Theorem 1.3. We require a lemma. 


LEMMA 4.1. Let $P$ denote a percolation of parameter $p \in (p_c, 1]$. For any
$C \in \mathbb{N}$, there exists $n_0 \in \mathbb{N}$ such that

    $\bigcap_{n \ge n_0} P_{n,C} \ne \emptyset$,

where the sets $P_{n,C}$ were specified in Definition 2.5.


PROOF. It follows from Theorem 7.2 of [6] and the assumption that p > pe 
that there exists almost surely an infinite cluster p of the process P N Zi where 
Zi = {ve Zf :v; -0O,iell,...,d)..Letxe PŁ. Note that the connected com- 
ponent of P N B, in which the site x lies has radius at least n — ||x||. Note that, for 
any o € (0, 1), if the event Q, (œ) defined in (2.34) occurs, and n > (1 — o)! [x], 
then x lies in the connected set C,(a), also defined in (2.34). Recall from the 
proof of Lemma 2.7 that, if a € (0, 1) is small enough that 0(p) > 2a + dda, 
then Ca (œ) = P, for high values of n. Recalling also that Q,,(@) occurs for all 
but finitely many n almost surely, we deduce that x € P, o for all high choices of n. 
The statement of the lemma for a positive value of C is obtained by translating the 
process P by the vector (—C, ..., —C), and applying the result for C — 0. DO 


PROOF OF THEOREM 1.3. Given $\epsilon > 0$, let $C, \ell \in \mathbb{N}$ and the percolation $P$
be those to which the statement of Theorem 1.1 refers. By Lemma 4.1, there
exists $F_0 \in \mathbb{N}$ such that we may choose $v \in \bigcap_{F \ge F_0} P_{F,C}$. Let $\xi_n \in A_{B_n}$ satisfy
$S(\xi_n) = G_n$ and $|\xi_n| = L_n$. Provided that $n \ge F_0\ell$ is also chosen to be so high
that we may apply Theorem 1.1 to $\xi_n \subseteq B_n$, we find that $\xi_n \cap B_{\ell v,\ell} \ne \emptyset$. Let
$\tau = (\tau_0, \dots, \tau_r)$ denote a path in $\mathbb{Z}^d$ such that $0 \in \tau$ and $B_{\ell v,\ell} \subseteq \tau$. Let
$V = \min\{S(\tau^s) : \tau^s = (\tau_0,\dots,\tau_s),\ s \in \{0,\dots,r\}\}$ be equal to the minimal weight of any
initial subpath of $\tau$. For each sufficiently high $n$, we may choose $s(n) \in \{0,\dots,r\}$
such that $\tau^{s(n)} \subseteq B_n \setminus \xi_n$ and $\tau^{s(n)} \cap \partial\xi_n \ne \emptyset$. Note that

(4.1)    $N_{|\xi_n \cup \tau^{s(n)}|} \ge S(\xi_n \cup \tau^{s(n)}) = S(\xi_n) + S(\tau^{s(n)}) \ge G_n + V.$

Given that $|\xi_n| = L_n$, we find from (2.19) and Theorem 2.1 of [3] that

(4.2)    $N_{|\xi_n \cup \tau^{s(n)}|} \le (N + \epsilon)(L_n + |\tau|)$,

for all $n$ sufficiently high. From (4.1) and (4.2), we deduce that

(4.3)    $G_n \le (N + \epsilon)L_n + (N + \epsilon)|\tau| - V.$

Given that $\epsilon > 0$ is arbitrary, and that the path $\tau$ is fixed, we obtain, by taking a
liminf of the $n^{-d}$th multiple of (4.3), the inequality $G \le NL$ that we sought. □

5. Critical behavior, proof of Theorem 1.4. We aim to prove that the quantity
$N$ is positive under the assumption that, for some $\epsilon > 0$,

(5.1)    $\limsup_{n\to\infty} n^{-1}(\log n)^{-d/(d-1)-\epsilon} G_n > 0$

with positive probability. This lim sup is nonrandom, similarly to $\limsup_n n^{-d} G_n$,
as explained at the beginning of the proof of Theorem 1.2. Hence, the
hypothesis (5.1) allows us to fix $\delta > 0$ for which

(5.2)    $\limsup_{n\to\infty} n^{-1}(\log n)^{-d/(d-1)-\epsilon} G_n > \delta$ almost surely.




Let $\lambda_0 = \inf\{\lambda \in \mathbb{R} : \mathbb{P}(X_0 > -\lambda) > p_c\}$. Recall from after (2.21) that, for $\lambda > \lambda_0$,
we denote by $W$ the unique infinite component of $\lambda$-white sites in $\mathbb{Z}^d$.


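DEFINITION 5.1. For $\lambda > \lambda_0$, $\rho > 0$, $n \in \mathbb{N}$ and $x \in \mathbb{Z}^d$, the event $E(x,n,\lambda,\rho)$
occurs if there exists $\gamma \in A_{B_{x,n}}$ such that $S(\gamma) = G_{B_{x,n}} > \delta n(\log n)^{d/(d-1)+\epsilon}$,
$|\gamma| = L_{B_{x,n}} > \delta(\log n)^{d/(d-1)+\epsilon}$, and a site $v \in \gamma \cap W$ satisfying
$D(u,v) \le \rho\,\ell_\infty(u,v)$ for each $u \in W \cap B_{x,n}[1]^c$.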


LEMMA 5.1. For $\lambda, \rho$ sufficiently high,

    $\{G_n > \delta n(\log n)^{d/(d-1)+\epsilon} \text{ occurs for infinitely many } n\}$
        $= \{E(0,n,\lambda,\rho) \text{ occurs for infinitely many } n\}$,

up to a set of measure zero.


PROOF. We must show that, for high enough values of n, Gn > ôn x 
(log 1)2/4-D-*e implies the occurrence of E(0, n, à, p) for given choices of A 
and p. As noted before (2.19), X, > ||v|| for at most finitely many v € Z? almost 
surely, so that G, > ón(log nyf/(d—L-e implies that 


(5.3) Ln > (logn) 4 De 


for high $n$. It follows from (2.25), written with $\rho'$ in place of $\rho$, and the
Borel–Cantelli lemma, that for any $\rho' > d/(d-1)$, and for all $n$ sufficiently high, each
$\gamma \in A_{B_n}$ satisfying $|\gamma| \ge (\log n)^{\rho'}$ intersects $W$. [Note that we require $\lambda$ chosen
high enough that (2.25) may be applied.] By (5.3), each greedy lattice animal
in $B_n$ intersects $W$ for all high $n$. Let $\gamma$ be a greedy lattice animal in $B_n$,
with $n$ chosen to be high enough that we may locate a site $v \in W \cap \gamma$. If a site
$u \in B_n[1]^c \cap W$ satisfies $D(v,u) > \rho\,\ell_\infty(v,u)$, then $D(v,u) > (\rho/4)\|u\|$, since,
setting $c = (\lfloor n/2\rfloor, \dots, \lfloor n/2\rfloor)$,

    $\ell_\infty(u,v) = \ell_\infty(u-c, v-c) \ge \|u-c\| - \|v-c\| \ge \|u-c\| - \lfloor n/2\rfloor - 1$
        $\ge \tfrac{2}{3}\|u-c\| - 1 \ge \tfrac{2}{3}(\|u\| - \|c\|) - 1 \ge \tfrac{1}{3}\|u\| - 1$,

the third inequality valid by $\|u-c\| \ge 3n/2$, and the fifth by $\|u\| \ge n \ge 2\|c\|$.
By Lemma 2.14 of [3], with o set equal to 4o(p, d), and a union bound, the 
probability that such a site u exists is at most exp{—cn}, for some positive 
constant $c$. The Borel–Cantelli lemma implies that each $u \in B_n[1]^c \cap W$
satisfies $D(u,v) \le \rho\,\ell_\infty(u,v)$, provided that $n$ is high enough. We have shown that
$G_n > \delta n(\log n)^{d/(d-1)+\epsilon}$ implies the occurrence of $E(0,n,\lambda,\rho)$ for high values
of $n$, as required. □
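
The geometric step in the proof above, that a site $u$ outside $B_n[1]$ and a site $v$ in $B_n$ satisfy $\ell_\infty(u,v) \ge \tfrac{1}{3}\|u\| - 1$, can be checked by brute force for small $n$. The sketch below is only an illustration and assumes the conventions $B_n = \{0,\dots,n-1\}^d$ and $B_n[1] = \{-n,\dots,2n-1\}^d$, which this section does not restate; `check_distance_bound` and its `pad` parameter are hypothetical names.

```python
from itertools import product


def sup_norm(u):
    """Sup-norm of a lattice site."""
    return max(abs(c) for c in u)


def check_distance_bound(n, d=2, pad=3):
    """Check l_inf(u, v) >= ||u||/3 - 1 for all v in B_n and all u outside B_n[1]
    within a window extending `pad` sites beyond B_n[1] (assumed box conventions)."""
    box = list(product(range(n), repeat=d))
    lo, hi = -n - pad, 2 * n + pad
    for u in product(range(lo, hi), repeat=d):
        if all(-n <= c <= 2 * n - 1 for c in u):
            continue  # u lies in B_n[1], so it is not constrained
        for v in box:
            dist = max(abs(a - b) for a, b in zip(u, v))
            # Integer form of dist < ||u||/3 - 1
            if 3 * dist < sup_norm(u) - 3:
                return False
    return True
```

The check only covers a finite window around the box; the inequality chain in the proof is what establishes the bound for every site outside $B_n[1]$.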


Defining the event $D(x,m,\lambda,\rho,C)$, for $x \in \mathbb{Z}^d$ and $C > 0$, according to

    $D(x,m,\lambda,\rho,C) = \{\exists \gamma \in A_{B_{x,m}},\ v \in \gamma : v \text{ is } \lambda\text{-white},\ S(\gamma) \ge Cm,$
        $D(v,u) \le \rho m \text{ for each corner } u \text{ of } B_{x,m}\}$,

we will now show that for any $\epsilon_0 > 0$, we may choose $\lambda$, $\rho$ and $C$ sufficiently high
that

(5.4)    $\mathbb{P}(D(0,m,\lambda,\rho,C)) \ge 1 - \epsilon_0$,

for high values of $m$. Let $c_1 \in (0,\infty)$ and $N_0 \in \mathbb{N}$ be chosen so that, for $n \ge N_0$,


yt/(d- 1e .. 3(2c1 + (2d — 1)Ap) 
6(1 — £0/2) 
By Lemma 5.1 and (5.2), we may fix N; > No for which 


(5.5) (logn 


(5.6) . P(E(0, n, A, p) occurs for some n € (No, ..., Ni]) > 1— £6 /4. 


Declare any site $x \in \mathbb{Z}^d$ to be full if the event $E(x, n, \lambda, \rho)$ occurs for some $n \in \{N_0, \ldots, N_1\}$. To any full site $x$, we may associate a lattice animal $\gamma_x$, a site $v_x \in \gamma_x$ and the box $\Gamma_x = B_{x,n_x}$, these objects arising from the definition of the event $E(x, n_x, \lambda, \rho)$, for the minimal $n_x \in \{N_0, \ldots, N_1\}$ for which this event occurs. Allowing $e_1$ to denote the unit vector $(1, 0, \ldots, 0)$, we set $x_j = je_1$ for $j \in \mathbb{N}$. We now form a subsequence $\{y_j : j \in \mathbb{N}\}$ of the sequence $\{x_j : j \in \mathbb{N}\}$. The first element $y_1$ is taken to be $x_j$, where $j$ is the lowest natural number for which $x_j$ is a full site. Having constructed an initial segment of the $y$-sequence, $\{y_j : j \in \{1, \ldots, K\}\}$, say, we set $y_{K+1}$ equal to the lowest-labeled site in the $x$-sequence which is full and has $e_1$-coordinate exceeding that of any site lying in the box $\Gamma_{y_K}[1]$.

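Noting that $v_{y_{i+1}} \in \Gamma_{y_i}[1]^c$, it follows from the definition of the event $E(y_i, n_{y_i}, \lambda, \rho)$ that we may join $v_{y_i}$ and $v_{y_{i+1}}$ by a path $\tau_i$ in $W$ of length at most $\rho\,\ell_\infty(v_{y_i}, v_{y_{i+1}})$. For each $J \in \mathbb{N}$, form the animal

$$\kappa_J = \gamma_{y_1} \cup \tau_1 \cup \gamma_{y_2} \cup \tau_2 \cup \cdots \cup \tau_{J-1} \cup \gamma_{y_J}.$$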


Let $\mathcal{E}$ be the collection of sites $x_i$ that are not full and that lie between $\Gamma_{y_j}[1]$ and $y_{j+1}$ for some $j \in \mathbb{N}$, or before $y_1$. Writing $H = \{|\mathcal{E}|\,m^{-1} > \varepsilon_0/2\}$, we claim that, for any $m \in \mathbb{N}$,

(5.7) $P(H^c) \ge 1 - \varepsilon_0/2.$

To see this, we perform an experiment in which we sample $z \in \{1, \ldots, m\}$ uniformly at random, and ask whether the site $x_z$ is full. If $P(H^c) < 1 - \varepsilon_0/2$, then

$$P(x_z \text{ is not full}) \ge P(x_z \text{ is not full} \mid H)\,P(H) > \varepsilon_0^2/4;$$

however, $P(x_z \text{ is not full})$ is the probability that a given site is not full, contradicting (5.6) and establishing (5.7).

For $m \in \mathbb{N}$, let $J\,(= J(m))$ be maximal such that $\Gamma_{y_J}[1]$ has maximum $e_1$-coordinate at most $m - 1$. Let us estimate the weight $S(\kappa_J)$ of the animal $\kappa_J$,


GREEDY LATTICE ANIMALS 633 


for fixed $m$. The animals $\gamma_{y_j}$ are disjoint for distinct $j$, and the paths $\tau_j$ lie in $W$. Thus,

(5.8) $S(\kappa_J) \ge \sum_{j=1}^{J} S(\gamma_{y_j}) - \lambda \sum_{j=1}^{J-1} |\tau_j|.$


In bounding the first term on the right-hand side of (5.8), note that, for $j \in \{1, \ldots, J\}$, we have that

$$S(\gamma_{y_j}) \ge \delta\, n_{y_j} (\log n_{y_j})^{d/(d-1)+\varepsilon}.$$

Since $n_{y_j} \ge N_0$ for any such $j$, from (5.5), it follows that

(5.9) $\sum_{j=1}^{J} S(\gamma_{y_j}) \ge \dfrac{3(2c_1 + (2d - 1)\rho\lambda)}{1 - \varepsilon_0/2} \sum_{j=1}^{J} n_{y_j}.$


To bound from below the quantity $\sum_{j=1}^{J} n_{y_j}$, note the following inclusion:

(5.10) $\{x_1, \ldots, x_m\} \subseteq \Big(\bigcup_{j=1}^{J} \Gamma_{y_j}[1]\Big) \cup \mathcal{E} \cup R,$

where the set $R$ denotes the final $2N_1 - 1$ sites of the interval $\{x_1, \ldots, x_m\}$, and appears because of the possibility that the site $y_{J+1}$ lies in this interval. From (5.10), it follows that, on the event $H^c$,

(5.11) $\sum_{j=1}^{J} n_{y_j} \ge \dfrac{m(1 - \varepsilon_0/2)}{3} - \dfrac{2N_1 - 1}{3}.$


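We must also bound from above the quantity $\sum_{j=1}^{J-1} |\tau_j|$. Note that

(5.12)
$$\begin{aligned}
\sum_{i=1}^{J-1} \ell_\infty(v_{y_i}, v_{y_{i+1}})
&\le \sum_{i=1}^{J-1} \sum_{l=1}^{d} |v^l_{y_i} - v^l_{y_{i+1}}| \\
&\le \sum_{i=1}^{J-1} |v^1_{y_i} - v^1_{y_{i+1}}| + (d - 1) \sum_{i=1}^{J-1} \max\{n_{y_i}, n_{y_{i+1}}\} \\
&\le \sum_{i=1}^{J-1} |v^1_{y_i} - v^1_{y_{i+1}}| + 2(d - 1) \sum_{i=1}^{J} n_{y_i} \\
&\le (2d - 1)m,
\end{aligned}$$

where, in the second inequality, we used the fact that, for each $i \in \mathbb{N}$ and $l \in \{2, \ldots, d\}$,

$$|v^l_{y_i} - v^l_{y_{i+1}}| \le \max\{n_{y_i}, n_{y_{i+1}}\},$$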




while in the fourth, we used the bounds

$$\sum_{i=1}^{J-1} |v^1_{y_i} - v^1_{y_{i+1}}| \le m \quad\text{and}\quad \sum_{i=1}^{J} n_{y_i} \le m.$$

(In the first of these, we used the fact that $v^1_{y_i}$ is increasing, which is true because the boxes $\Gamma_{y_i}$ are disjoint. The second also uses this disjointness.) From (5.12) and $|\tau_i| \le \rho\,\ell_\infty(v_{y_i}, v_{y_{i+1}})$, it follows that

(5.13) $\sum_{j=1}^{J-1} |\tau_j| \le (2d - 1)\rho m.$

Substituting the bounds (5.9) and (5.13) into (5.8) yields

$$S(\kappa_J) \ge \frac{3(2c_1 + (2d - 1)\rho\lambda)}{1 - \varepsilon_0/2} \sum_{j=1}^{J} n_{y_j} - \lambda\rho(2d - 1)m.$$

Substituting (5.11) into this inequality, we see that, for high values of $m$,

(5.14) $H^c \subseteq \Big\{S(\kappa_J) \ge 2c_1 m - \dfrac{(2N_1 - 1)(2c_1 + (2d - 1)\rho\lambda)}{1 - \varepsilon_0/2}\Big\} \subseteq \{S(\kappa_J) \ge c_1 m\}.$
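The substitution leading to (5.14) is a short piece of arithmetic that can be checked exactly. The sketch below assumes the constants read as in (5.9), (5.11) and (5.13); the function names and the sample parameter values are illustrative only and do not come from the paper.

```python
from fractions import Fraction

def lower_bound_after_substitution(c1, d, lam, rho, eps0, N1, m):
    """Insert the lower bound (5.11) on sum_j n_{y_j} into the inequality preceding (5.14)."""
    K = 3 * (2 * c1 + (2 * d - 1) * rho * lam) / (1 - eps0 / 2)
    sum_n = (m * (1 - eps0 / 2) - (2 * N1 - 1)) / 3   # right-hand side of (5.11)
    return K * sum_n - lam * rho * (2 * d - 1) * m

def stated_bound(c1, d, lam, rho, eps0, N1, m):
    """The expression inside the first event of (5.14)."""
    return 2 * c1 * m - (2 * N1 - 1) * (2 * c1 + (2 * d - 1) * rho * lam) / (1 - eps0 / 2)

# Hypothetical sample values: c1, d, lambda, rho, eps0, N1.
sample = (Fraction(1, 3), 3, 2, 5, Fraction(1, 10), 7)
```

With exact rational inputs the two functions agree for every $m$, and `stated_bound` exceeds $c_1 m$ once $m$ is large compared with $N_1$, which is the content of the second inclusion in (5.14).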

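We now claim that

(5.15) $H^c \subseteq \big\{\kappa_J \subseteq \{v \in \mathbb{Z}^d : \ell_\infty(v, B_m) \le \rho(m\varepsilon_0/2 + (d + 2)N_1)\}\big\}.$

To show (5.15), note that $\gamma_{y_j} \subseteq \Gamma_{y_j} \subseteq B_m$ for each $j \in \{1, \ldots, J\}$, the latter inclusion valid by the definition of $J = J(m)$. For $j \in \{1, \ldots, J - 1\}$ and $v \in \tau_j$,

(5.16) $\ell_\infty(v, B_m) \le |\tau_j| \le \rho\,\ell_\infty(v_{y_j}, v_{y_{j+1}}) \le \rho\big(|v^1_{y_j} - v^1_{y_{j+1}}| + (d - 1)\max\{n_{y_j}, n_{y_{j+1}}\}\big),$

the third inequality following similarly to (5.12). Note that

(5.17) $|v^1_{y_j} - v^1_{y_{j+1}}| \le |\mathcal{E}| + 2n_{y_j} + n_{y_{j+1}}$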


Na 


because vy 
dJ 


vy, is at most ny jı More than yj, the minimum e;-coordinate of the box Ty is 
while each site $x_j = je_1$ for which $j$ lies strictly between this maximum and this minimum belongs to $\mathcal{E}$. Given that $|\mathcal{E}| < m\varepsilon_0/2$ on the event $H^c$, and that $\max\{n_{y_j}, n_{y_{j+1}}\} \le N_1$, we find that (5.16) and (5.17) imply (5.15).

By (5.14), (5.15) and the bound IP(H*) > 1 — £o/2, we find, provided that £o 
has been chosen so that £o < 20^ |, that, for m sufficiently high, 


P(dy € Ap,np vey 1 W:S(y) z eum, 
(5.18) ue WN {x eE Zs Loo (X, By) > m/2) => D(v,u) < p£oo(v, u)) 
-1— 9/2, 


is at most 2ny, less than the maximum e; -coordinate of the box Iy jl, 




the role of $v$ in (5.18) being played by any $v_{y_i}$ for $i \in \{1, \ldots, J\}$. (We are using the fact that $B_{y_i,n_{y_i}}[1] \subseteq \{x \in \mathbb{Z}^d : \ell_\infty(x, B_m) \le m/2\}$, which is implied, provided that $m \ge 2N_1$, by $B_{y_i,n_{y_i}} \subseteq B_m$ and $N_1 \ge n_{y_i}$.) Note also that

(5.19) $P(\text{each corner of } B_m[1] \text{ lies in } W) \ge 1 - 2^d P(0 \notin W) \ge 1 - \varepsilon_0/2,$


since (2.31) permits us to choose $\lambda \in \mathbb{R}$ so that the second inequality is valid. By (5.18), (5.19) and the translation invariance of the process $\{X_v : v \in \mathbb{Z}^d\}$, we find that, for $m$ large and divisible by 3,

$$P\big(\exists\, \gamma \in \mathcal{A}_{B_m},\ v \in \gamma \cap W : S(\gamma) \ge (c_1/3)m,\ D(v, u) \le \rho m \text{ for each corner } u \text{ of } B_m\big) \ge 1 - \varepsilon_0.$$

The condition that $m$ is divisible by 3 occurs because the sidelength of the box $B_m[1]$ must satisfy this. It may be dropped by replacing $B_m[1]$ in (5.19) by a box that extends by one or two sites further on one half of its faces. Writing $c_1 = 3C$, we have shown (5.4).

By the proof of Lemma 2.5, the process

(5.20) $\{a \in \mathbb{Z}^d : D(ma, m, \lambda, \rho, C) \text{ occurs}\}$

is a $(2\rho + 1)$-near percolation, for any given $m \in \mathbb{N}$. By (5.4), Lemma 2.6 and (2.31), we may fix $\lambda, \rho, C > 0$ and a high value of $m$ so that there exists a subset $P$ of (5.20) that is a percolation of supercritical parameter. Let $\{a_i : i \in \mathbb{N}\}$ denote an infinite self-avoiding path in $P$. Note that, for each $i \in \mathbb{N}$, $D(\gamma_{a_i}, \gamma_{a_{i+1}}) \le 2\rho m + 1$, where $\gamma_a \subseteq B_{ma,m}$ and $v_a \in \gamma_a$ denote the lattice animal and the site therein resulting from the occurrence of $D(ma, m, \lambda, \rho, C)$: the inequality is due to $D(\gamma_{a_i}, \gamma_{a_{i+1}}) \le D(v_{a_i}, w) + D(w', v_{a_{i+1}}) + 1$, where $w, w'$ are adjacent corners of $B_{ma_i,m}$ and $B_{ma_{i+1},m}$. Let $\phi_i$ denote a white path of length at most $2\rho m + 1$ from $\gamma_{a_i}$ to $\gamma_{a_{i+1}}$. Let $\phi_0$ denote an arbitrary path from 0 to $\gamma_{a_1}$. We form an increasing sequence of lattice animals $\{R_i : i \in \mathbb{N}\}$, satisfying $|R_i| = i$, with $R_0 = \{0\}$, which successively collect the sites of the path $\phi_0$, then the animal $\gamma_{a_i}$ and the path $\phi_i$, for each $i \in \{1, 2, \ldots\}$ in turn.
Let $n_i = \inf\{j \in \mathbb{N} : R_j \supseteq \gamma_{a_i}\}$. Then


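$$R_{n_i} = \phi_0 \cup \bigcup_{j=1}^{i-1} (\gamma_{a_j} \cup \phi_j) \cup \gamma_{a_i}.$$

We find that

(5.21)
$$\begin{aligned}
S(R_{n_i}) &\ge \sum_{j=1}^{i} S(\gamma_{a_j}) - S(\phi_0) - \lambda \sum_{j=1}^{i-1} (|\phi_j| - 2) \\
&\ge Cmi - \lambda(2\rho m - 1)(i - 1) - S(\phi_0),
\end{aligned}$$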




where the first inequality follows from the animals $\gamma_{a_j} \subseteq B_{ma_j,m}$ being disjoint and the paths $\phi_j$ being white, their endpoints lying in $\gamma_{a_j}$ or $\gamma_{a_{j+1}}$. Note also that

(5.22) $n_i \le |\phi_0| + m^d i + (2\rho m - 1)(i - 1),$

since $|\gamma_{a_i}| \le m^d$ and $|\phi_i| - 2 \le 2\rho m - 1$. By (5.21), (5.22) and $|R_{n_i}| = n_i$, we obtain

$$\liminf_{i \to \infty} \frac{S(R_{n_i})}{n_i} \ge \frac{Cm - \lambda(2\rho m - 1)}{m^d + 2\rho m - 1} > 0,$$

provided that the constant $C$ is chosen so that $C > 2\lambda\rho$. Thus, on our hypothesis (5.1), $N > 0$, by the definition of $N$. This completes the proof.


Acknowledgments. I am grateful to Amir Dembo for introducing the subject
of greedy lattice animals to me, and for numerous helpful discussions during the 
course of this project. I thank him in particular for postulating Theorem 1.3 and its 
proof by means of Proposition 1.5. Thanks to Peter Teichner for help in proving 
Lemma 2.1, and to Gábor Pete, Yuval Peres and a referee for useful comments on 
a draft version of the paper. 


REFERENCES 


[1] ANTAL, P. and PISZTORA, A. (1996). On the chemical distance for supercritical Bernoulli 
percolation. Ann. Probab. 24 1036-1048. MR1404543 
[2] COX, J. T., GANDOLFI, A., GRIFFIN, P. S. and KESTEN, H. (1993). Greedy lattice animals I:
Upper bounds. Ann. Appl. Probab. 3 1151-1169. MR1241039
[3] DEMBO, A., GANDOLFI, A. and KESTEN, H. (2001). Greedy lattice animals: Negative values
and unconstrained maxima. Ann. Probab. 29 205-241. MR1825148
[4] DURRETT, R. (1996). Probability: Theory and Examples, 2nd ed. Duxbury, Belmont, CA. 
MR1609153 
[5] GANDOLFI, A. and KESTEN, H. (1994). Greedy lattice animals II: Linear growth. Ann. Appl. 
Probab. 4 76-107. MR1258174 
[6] GRIMMETT, G. (1999). Percolation, 2nd ed. Springer, Berlin. MR1707339 
[7] HUGHES, B. D. (1996). Random Walks and Random Environments. 2. Random Environments. 
Oxford Univ. Press. MR1420619 
[8] KESTEN, H. (1982). Percolation Theory for Mathematicians. Birkhäuser, Boston. MR0692943
[9] KESTEN, H. (1987). Aspects of first passage percolation. Lecture Notes in Math. 1180 125—264. 
Springer, Berlin. MR0876084 
[10] KESTEN, H. and ZHANG, Y. (1987). Strict inequalities for some critical exponents in two- 
dimensional percolation. J. Statist. Phys. 46 1031—1055. MR0893131 
[11] HAMMOND, A. M. (2004). Greedy lattice animals: Geometry and criticality (with an Appen- 
dix). Available at www.arxiv.org/math.PR/0411459. 
[12] HIRSCH, M. W. (1976). Differential Topology. Springer, Berlin. MR0448362
[13] LEADER, I. (1991). Discrete isoperimetric inequalities. Proc. Sympos. Appl. Math. 44 57-80. 
MR1141923 
[14] LEE, S. (1993). An inequality for greedy lattice animals. Ann. Appl. Probab. 3 1170-1188. 
MR1241040 
[15] LEE, S. (1997). The continuity of M and N in greedy lattice animals. J. Theoret. Probab. 10 
87-100. MR1432617 


[16] LEE, S. (1997). The power laws of M and N in greedy lattice animals. Stochastic Process.
Appl. 69 275-287. MR1472955
[17] LIGGETT, T. M., SCHONMANN, R. H. and STACEY, A. M. (1997). Domination by product
measures. Ann. Probab. 25 71-95. MR1428500
[18] LOOMIS, L. H. and WHITNEY, H. (1949). An inequality related to the isoperimetric inequality.
Bull. Amer. Math. Soc. 55 961-962. MR0031538
[19] MARTIN, J. B. (2002). Linear growth for greedy lattice animals. Stochastic Process. Appl. 98
43-66. MR1884923
[20] MATHIEU, P. and REMY, E. (2004). Isoperimetry and heat kernel decay on percolation clusters.
Ann. Probab. 32 100-128. MR2040777
[21] MENSHIKOV, M. V. and ZUYEV, S. A. (1992). Models of p-percolation. In Petrozavodsk Con-
ference on Probabilistic Models in Discrete Mathematics 337-347. MR1383148
[22] PENROSE, M. D. and PISZTORA, A. (1996). Large deviations for discrete and continuous per-
colation. Adv. in Appl. Probab. 28 29-52. MR1372330


DEPARTMENT OF MATHEMATICS 
1984 MATHEMATICS ROAD 
UNIVERSITY OF BRITISH COLUMBIA 
VANCOUVER, BRITISH COLUMBIA 
CANADA V6T 1Z2 

E-MAIL: alanmh@math.ubc.ca


The Annals of Probability 

2006, Vol. 34, No. 2, 638-662 

DOI: 10.1214/009117905000000738

© Institute of Mathematical Statistics, 2006 


WIENER CHAOS SOLUTIONS OF LINEAR STOCHASTIC 
EVOLUTION EQUATIONS 


By S. V. LOTOTSKY¹ AND B. L. ROZOVSKII²
University of Southern California 


A new method is described for constructing a generalized solution of 
a stochastic evolution equation. Existence, uniqueness, regularity and a prob- 
abilistic representation of this Wiener Chaos solution are established for 
a large class of equations. As an application of the general theory, new re- 
sults are obtained for several types of the passive scalar equation. 


1. Introduction. Consider a stochastic evolution equation

(1.1) $du(t) = (\mathcal{A}u(t) + f(t))\,dt + (\mathcal{M}u(t) + g(t))\,dW(t),$

where $\mathcal{A}$ and $\mathcal{M}$ are differential operators, and $W$ is a cylindrical Brownian motion on a stochastic basis $\mathbb{F} = (\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t \ge 0}, P)$. Let $\mathcal{A}_0$ and $\mathcal{M}_0$ be the leading (highest-order) terms of the operators $\mathcal{A}$ and $\mathcal{M}$, respectively, and

$$Z_t(u) = \int_0^t (\mathcal{M}u(s) + g(s))\,dW(s).$$

Traditionally, (1.1) is studied under the following assumptions:

(i) The operator $\mathcal{A}_0 - \frac{1}{2}\mathcal{M}_0\mathcal{M}_0^*$ is elliptic.
(ii) The noise term $Z_t(v)$ is sufficiently regular. More specifically, for a suitable function space $X$, $E\int_0^T \|Z_t(v)\|_X^2\,dt < \infty$ for all $v$ in a dense subspace of $X$.


Under these assumptions, there exists a unique Itô (strong) solution or a martingale (weak) solution $u$ of (1.1) so that $u \in L_2(\Omega \times (0, T); X)$ for $T > 0$ (see, e.g., [5, 19, 33, 35]). In the future, we will refer to such solutions as traditional or square integrable solutions.

There are important examples demonstrating that the assumptions (i)-(ii) are necessary for the existence of a square integrable solution of (1.1).

In particular, it was shown in [30] that the stochastic advection-diffusion equation

$$\frac{\partial u(t, x)}{\partial t} = \Delta u(t, x) + u(t, x)\,\dot{W}(t, x), \qquad x \in \mathbb{R}^d,$$


Received July 2004; revised December 2004. 
¹Supported by a Sloan Research Fellowship, NSF Career Award DMS-02-37724 and ARO Grant DAAD19-02-1-0374.
²Supported by ARO Grant DAAD19-02-1-0374 and ONR Grant N0014-03-1-0027.
AMS 2000 subject classifications. Primary 60H15; secondary 35R60, 60H40.
Key words and phrases. Feynman-Kac formula, generalized random elements, stochastic parabolic equations, turbulent transport, white noise.




WIENER CHAOS SOLUTIONS 639 


where $\dot{W}(t, x)$ is the space-time white noise, has no square integrable solutions in any Sobolev space $X = H^s(\mathbb{R}^d)$, $s \in \mathbb{R}$, if $d \ge 2$. In this case assumption (i) holds but assumption (ii) does not.

Simple calculations show (see, e.g., [35]) that the equation

$$du(t, x) = a^2 \frac{\partial^2}{\partial x^2} u(t, x)\,dt + \sigma \frac{\partial}{\partial x} u(t, x)\,dw(t),$$

where $w(t)$ is a one-dimensional Brownian motion and $u(0, x)$ is square integrable, has no square integrable solutions unless $a^2 - \frac{1}{2}\sigma^2 > 0$, that is, unless (i) holds. Obviously, (i) implies that the order of the operator $\mathcal{M}$ must be no larger than a half of the order of $\mathcal{A}$. Moreover, this assumption is, in some way, counterintuitive. Indeed, in the deterministic theory, only the highest-order operator ($\mathcal{A}_0$, in our case) has to be elliptic.
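The role of the sign of $a^2 - \frac{1}{2}\sigma^2$ can be seen on a single Fourier mode. The sketch below is an illustration, not the computation of [35]: it assumes the Fourier transform in $x$ turns the equation into $d\hat{u} = -a^2 y^2 \hat{u}\,dt + i\sigma y \hat{u}\,dw$, whose solution by the Itô formula is $\hat{u}(t, y) = \hat{u}(0, y)\exp((\sigma^2/2 - a^2)y^2 t + i\sigma y w(t))$. The function name is hypothetical.

```python
from math import exp

def mode_energy_factor(a, sigma, y, t):
    """Ratio |u_hat(t, y)|^2 / |u_hat(0, y)|^2 for the Fourier mode with frequency y.

    The factor exp(i * sigma * y * w(t)) has modulus one, so the ratio is the
    deterministic quantity exp(-(2 a^2 - sigma^2) y^2 t).
    """
    return exp(-(2 * a * a - sigma * sigma) * y * y * t)
```

When $a^2 - \frac{1}{2}\sigma^2 > 0$ the factor decays in $y$, while for $a^2 - \frac{1}{2}\sigma^2 < 0$ it grows like $\exp(c y^2 t)$, so a generic square integrable initial condition does not stay square integrable.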

The objective of the current paper is to study stochastic differential equations 
of the type (1.1) without the restrictive assumptions (i) and (ii). The basic idea can 
be described as follows. If (1.1) does have a sufficiently regular solution, this so- 
lution can be projected on an orthonormal basis in some Hilbert space, resulting in 
a system of equations for the corresponding Fourier coefficients. We now turn this 
argument around, and define the solution of (1.1) as a formal Fourier series with 
the coefficients computed by solving the corresponding system. It often happens 
that this system has a solution under more general conditions than the original 
equation. | 

This approach could be traced to the classical separation of variables ideas 
in PDE’s. For example, the Navier-Stokes equation is often defined as a system 
of coupled ODE’s for the modes of its formal Fourier expansion with respect to 
the spatial variables (see, e.g., [7] and [26]). Similarly, in our case, the nonrandom 
spatiotemporal variables (x, f) are being separated from the “random variable" 
(Brownian motion). 

More specifically, the traditional solution of (1.1) is an $\mathcal{F}_T^W$-measurable square integrable random variable taking values in a Hilbert space $X$. A classical result by Cameron and Martin [4] provides an orthonormal basis $\Xi = \{\xi_\alpha, \alpha \in \mathcal{J}\}$ in the Wiener Chaos space $L_2(W; X)$, where $\mathcal{J}$ is the set of multi-indices $\alpha = \{\alpha_i^k\}$ of finite length $|\alpha| = \sum_{i,k} \alpha_i^k$. Accordingly, a Wiener Chaos solution of (1.1) is defined as a formal Fourier series with respect to the Cameron-Martin basis $\Xi$. By construction, this solution is strong in the probabilistic sense, that is, uniquely determined by the coefficients, free terms, initial condition and the Wiener process. The coefficients in the Fourier series are computed by solving the corresponding propagator, a lower-triangular system of deterministic parabolic equations, uniquely determined by (1.1) (in earlier works, e.g., [28], the propagator is referred to as S-system).

Of course, unless both (i) and (ii) hold, one could not expect the resulting Fourier series to converge in $L_2(W; X)$. However, we demonstrate that under quite general assumptions, the Fourier series converges in a "minimal" weighted Wiener


640 S. V. LOTOTSKY AND B. L. ROZOVSKII 


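Chaos space $L_{2,Q}(W; X)$. The construction of this space is quite simple (cf. [30]). Given a sequence of positive numbers $Q = \{q_1, q_2, \ldots\}$, we define $L_{2,Q}(W; X)$ as the collection of sequences $u = \{u_\alpha, \alpha \in \mathcal{J}\}$ with $u_\alpha \in X$ so that

$$\|u\|^2_{L_{2,Q}(W;X)} = \sum_{\alpha \in \mathcal{J}} q^{2\alpha} \|u_\alpha\|_X^2 < \infty,$$

where

$$q^\alpha = \prod_{i,k} q_k^{\alpha_i^k}, \qquad \alpha \in \mathcal{J}.$$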


We remark that the space L2(W; X) is a very special example of the sequence 
spaces studied in [15-17]. 

Of course, the Wiener Chaos solution is a weak solution designed for dealing 
with equations that do not have square integrable solutions. However, in some 
problems, the Wiener Chaos solution serves as a convenient first step in the inves- 
tigation of square integrable solutions. For example, this is the case in the proof 
of the existence of square integrable solutions of degenerate parabolic SPDE's 
(Corollary 4.2 and Theorem 6.3). 

Constructions based on various forms of the Wiener Chaos decomposition are 
popular in the study of stochastic differential equations, both ordinary and with 
partial derivatives. For stochastic ordinary differential equations, [20] used multiple Wiener integral expansion to study Itô's diffusions with nonsmooth coefficients. More recently, LeJan and Raimond [23] used a similar approach in the
construction of stochastic flows. Various versions of the Wiener Chaos appear in 
a number of papers on nonlinear filtering and related topics; see, for example, 
[3, 24, 28, 31, 37]. The book by Holden et al. [12] presents a systematic approach 
to the stochastic differential equations based on the white noise theory; see also 
[11, 34] and the references therein. 

The propagator was first introduced by Mikulevicius and Rozovskii [27], and 
further studied in [24], as a numerical tool for solving the Zakai filtering equa- 
tion. In [30], the propagator was used to construct a generalized solution of the 
reaction-diffusion equation driven by the space-time white noise in several space 
dimensions. A similar system can be derived for certain nonlinear equations, such 
as the stochastic Navier-Stokes equation [29]. 

The main results of the paper are: 


(1) Existence, uniqueness, regularity and the Krylov—Veretennikov formula for 
the Wiener Chaos solution of (1.1) (Theorems 3.4, 4.1 and Corollary 4.2). 

(2) A Feynman-Kac formula for Wiener Chaos solutions in $L_{2,Q}(W; L_2(\mathbb{R}^d))$
(Theorem 5.1).

(3) Existence, uniqueness and regularity properties for the transport equation 
with: 


(a) a space-time white noise-type velocity field (Theorem 6.1); 




(b) an incompressible Kraichnan turbulent velocity field, the nonviscous 
case (Theorem 6.3). 


The paper is organized as follows. Section 2 introduces the Cameron—Martin 
basis and the weighted Wiener Chaos spaces. Section 3 presents the general defi- 
nition of the Wiener Chaos solution and establishes the connection with the tra- 
ditional and the white noise solutions. In the three main Sections, 4, 5 and 6, 
three types of stochastic equations are studied under three different sets of assump- 
tions; these assumptions are listed at the beginning of the corresponding section. 
Section 4 presents the basic existence/uniqueness/regularity result for an abstract 
stochastic evolution equation under assumptions (A1)-(A3). Section 5 establishes 
probabilistic and multiple Wiener integral representations of Wiener Chaos solutions for stochastic partial differential equations in $\mathbb{R}^d$ with nonrandom coefficients
under assumptions (B0)-(B5). Finally, Section 6 illustrates the results of the pre- 
vious two sections for several versions of the turbulent transport equation under 
assumptions (S1)—(S3). 

The following notation will be in force throughout the paper. $\Delta$ is the Laplace operator, $D_i = \partial/\partial x_i$, $i = 1, \ldots, d$, and summation over the repeated indices is assumed. The space of continuous functions is denoted by $C$, and $H_2^\gamma(\mathbb{R}^d)$, $\gamma \in \mathbb{R}$, is the Sobolev space

$$\Big\{f : \int_{\mathbb{R}^d} |\hat{f}(y)|^2 (1 + |y|^2)^\gamma\,dy < \infty\Big\}, \quad\text{where } \hat{f} \text{ is the Fourier transform of } f.$$


2. Weighted Wiener Chaos spaces. In this section we review the construc- 
tion of the Cameron-Martin basis and define the spaces of generalized random 
elements. 

For a fixed $T > 0$, let $\mathbb{F} = (\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{0 \le t \le T}, P)$ be a stochastic basis with the usual assumptions and let $W = \{w_k = w_k(t),\ k \ge 1,\ 0 \le t \le T\}$ be a collection of independent standard Brownian motions on $\mathbb{F}$. Denote by $\mathcal{F}_t^W$ the $\sigma$-algebra generated by the random variables $\{w_k(s),\ k \ge 1,\ s \le t\}$, and by $L_2(W)$, the Hilbert space of $\mathcal{F}_T^W$-measurable square integrable random variables.

Our first step is to construct the Cameron—Martin basis, a special orthonormal 
basis in the space L2(W). 

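Let $\mathfrak{m} = \{m_k,\ k \ge 1\}$ be an orthonormal basis in $L_2((0, T))$ so that each function $m_k = m_k(t)$ is bounded for $t \in [0, T]$. Given such a basis $\mathfrak{m}$, define independent standard Gaussian random variables

$$\xi_{ik} = \int_0^T m_i(s)\,dw_k(s).$$

Consider the set of multi-indices

$$\mathcal{J} = \Big\{\alpha = (\alpha_i^k,\ i, k \ge 1),\ \alpha_i^k \in \{0, 1, 2, \ldots\},\ \sum_{i,k} \alpha_i^k < \infty\Big\}.$$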




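The set $\mathcal{J}$ is countable, and, for every $\alpha \in \mathcal{J}$, only finitely many of $\alpha_i^k$ are not equal to zero. For $\alpha \in \mathcal{J}$, we write

$$|\alpha| = \sum_{i,k} \alpha_i^k, \qquad \alpha! = \prod_{i,k} \alpha_i^k!,$$

and define the collection $\Xi = \{\xi_\alpha, \alpha \in \mathcal{J}\}$ of random variables so that

(2.1) $\xi_\alpha = \dfrac{1}{\sqrt{\alpha!}} \prod_{i,k} H_{\alpha_i^k}(\xi_{ik}),$

where

$$H_n(t) = (-1)^n e^{t^2/2} \frac{d^n}{dt^n} e^{-t^2/2}$$

is the $n$th Hermite polynomial.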

THEOREM 2.1. The collection $\Xi = \{\xi_\alpha, \alpha \in \mathcal{J}\}$ is an orthonormal basis in $L_2(W)$.


PROOF. This is a version of the classical result by Cameron and Martin [4]. □


Using the Cameron-Martin basis $\Xi$ in $L_2(W)$, we now define the space of generalized random elements.

Let $X$ be a Banach space and $Q = \{q_1, q_2, \ldots\}$, a sequence of positive numbers. Define

(2.2) $q^\alpha = \prod_{i,k} q_k^{\alpha_i^k}, \qquad \alpha \in \mathcal{J}.$


DEFINITION 2.2. (i) The $Q$-weighted Wiener Chaos space $L_{2,Q}(W; X)$ is the collection of sequences $u = \{u_\alpha, \alpha \in \mathcal{J}\}$ with $u_\alpha \in X$ so that

$$\|u\|^2_{L_{2,Q}(W;X)} = \sum_{\alpha \in \mathcal{J}} q^{2\alpha} \|u_\alpha\|_X^2 < \infty.$$

(ii) The $Q^{-1}$-weighted Wiener Chaos space $L_{2,Q^{-1}}(W; X)$ is the collection of sequences $u = \{u_\alpha, \alpha \in \mathcal{J}\}$ with $u_\alpha \in X$ so that

$$\|u\|^2_{L_{2,Q^{-1}}(W;X)} = \sum_{\alpha \in \mathcal{J}} q^{-2\alpha} \|u_\alpha\|_X^2 < \infty.$$


The space $X$ will be omitted from the notation if $X = \mathbb{R}$. The sequence $Q$ will be omitted from the notation if $Q = 1$, that is, if $q_k = 1$ for all $k$.
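For concreteness, the weight $q^\alpha$ of (2.2) and the weighted norm of Definition 2.2 can be evaluated for a finitely supported sequence with real coefficients. The sketch below is only an illustration: a multi-index is stored sparsely as a mapping from pairs $(i, k)$ to the nonzero entries $\alpha_i^k$, and the function names and the sample weight sequence are assumptions made here, not notation from the paper.

```python
from fractions import Fraction
from math import prod

def q_power(q, alpha):
    """q^alpha = prod_{i,k} q_k^{alpha_i^k}; alpha maps (i, k) to a positive integer."""
    return prod(q(k) ** a for (i, k), a in alpha.items())

def weighted_norm_sq(q, coeffs):
    """Sum over alpha of q^{2 alpha} |u_alpha|^2 for finitely many real coefficients.

    coeffs maps a multi-index, written as a tuple of ((i, k), entry) pairs,
    to the real number u_alpha.
    """
    return sum(q_power(q, dict(alpha)) ** 2 * u * u for alpha, u in coeffs.items())

def geometric_weights(k):
    """Hypothetical sample weight sequence q_k = 2^{-k}."""
    return Fraction(1, 2 ** k)
```

The empty multi-index has weight one, so the zeroth coefficient always enters the norm unweighted; small weights $q_k$ penalize high-order coefficients less severely and therefore enlarge the space.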




Given a $u = \{u_\alpha, \alpha \in \mathcal{J}\} \in L_{2,Q}(W; X)$, we call each $u_\alpha$ a generalized Fourier coefficient of $u$ and identify $u$ with a formal Fourier series

$$u = \sum_{\alpha \in \mathcal{J}} u_\alpha \xi_\alpha.$$

The members of the set $L_{2,Q}(W; X)$ are called $X$-valued generalized random elements. Similarly, the members of the set $L_{2,Q}(W; L_1((0, T); X))$ are called $X$-valued generalized random processes.

For $u \in L_{2,Q}(W; X)$ and $v \in L_{2,Q^{-1}}(W)$ we define

(2.3) $\langle\langle u, v \rangle\rangle = \sum_{\alpha \in \mathcal{J}} u_\alpha v_\alpha;$

the series in (2.3) converges in the norm of $X$ by the Cauchy-Schwarz inequality.


3. Linear stochastic evolution equations and the propagator. In this sec- 
tion we define the Wiener Chaos solution and establish its connection with the 
traditional and white noise solutions. To motivate the definition, we start by re- 
viewing the main results about the traditional solution. 

Let $(V, H, V')$ be a normal triple of Hilbert spaces so that $V \subset H \subset V'$ with both embeddings continuous; for the complete definition of the normal triple see Section 3.1 in [35]. Denote by $\langle v', v \rangle$, $v' \in V'$, $v \in V$, the duality between $V$ and $V'$ relative to the inner product in $H$.

For $t \in [0, T]$, consider families of linear operators $\mathcal{A} = \mathcal{A}(t)$ and $\mathcal{M}_k = \mathcal{M}_k(t)$ so that, for each $t$, the operators $\mathcal{A}(t) : V \to V'$, $\mathcal{M}_k(t) : V \to H$ are bounded. Consider the following equation:

(3.1) $u(t) = u_0 + \int_0^t (\mathcal{A}u(s) + f(s))\,ds + \int_0^t (\mathcal{M}_k u(s) + g_k(s))\,dw_k(s), \qquad 0 \le t \le T.$


Recall that summation convention over the pairs of repeated indices is in force.
We proceed with a review of the traditional approach. Assume that, for all $v \in V$, $t \in [0, T]$,

(3.2) $\sum_{k \ge 1} \|\mathcal{M}_k(t)v\|_H^2 < \infty,$

and the nonrandom input data $u_0$, $f$ and $g_k$ satisfy

(3.3) $\|u_0\|_H^2 + \int_0^T \|f(t)\|_{V'}^2\,dt + \sum_{k \ge 1} \int_0^T \|g_k(t)\|_H^2\,dt < \infty.$




DEFINITION 3.1. An $\mathcal{F}_t^W$-adapted process $u \in L_2(W; L_2((0, T); V))$ is called a square integrable solution of (3.1) if, for every $v \in V$, there exists a measurable subset $\Omega'$ of $\Omega$ with $P(\Omega') = 1$, so that, for all $0 \le t \le T$, the equality

(3.4) $(u(t), v)_H = (u_0, v)_H + \int_0^t \langle \mathcal{A}u(s) + f(s), v \rangle\,ds + \sum_{k \ge 1} \int_0^t (\mathcal{M}_k u(s) + g_k(s), v)_H\,dw_k(s)$

holds on $\Omega'$. Similarly, $u \in L_2((0, T); V)$ is a solution of the deterministic equation

$$u(t) = u_0 + \int_0^t \mathcal{A}u(s)\,ds + \int_0^t f(s)\,ds$$

if, for every $v \in V$ and $t \in [0, T]$, equality (3.4) holds with $\mathcal{M}_k = g_k = 0$.


Existence and uniqueness of the traditional solution of (3.1) are established under an additional assumption about the operators $\mathcal{A}$ and $\mathcal{M}_k$.


DEFINITION 3.2. Equation (3.1) is called strongly parabolic if there exists a positive number $\varepsilon$ and a real number $C_0$ so that, for all $v \in V$ and $t \in [0, T]$,

(3.5) $2\langle \mathcal{A}(t)v, v \rangle + \sum_{k \ge 1} \|\mathcal{M}_k(t)v\|_H^2 + \varepsilon\|v\|_V^2 \le C_0\|v\|_H^2.$

Equation (3.1) is called weakly parabolic (or degenerate parabolic) if condition (3.5) holds with $\varepsilon = 0$.


THEOREM 3.3. If (3.3) holds and (3.1) is strongly parabolic, then there exists a unique square integrable solution of (3.1). The solution process $u$ belongs to

$$L_2(W; L_2((0, T); V)) \cap L_2(W; C((0, T), H))$$

and satisfies

(3.6) $E\Big(\sup_{0 \le t \le T} \|u(t)\|_H^2 + \int_0^T \|u(t)\|_V^2\,dt\Big) \le C(C_0, \varepsilon, T)\Big(\|u_0\|_H^2 + \int_0^T \|f(t)\|_{V'}^2\,dt + \sum_{k \ge 1} \int_0^T \|g_k(t)\|_H^2\,dt\Big).$

PROOF. This follows, for example, from Theorem 3.1.4 in [35]. □


When (3.1) is weakly parabolic, then the solvability result is somewhat different; see Section 3.2 in [35] for details.

As an element of the Hilbert space $L_2(W; L_2((0, T); V))$, the traditional solution of (3.1) admits a representation $u(t) = \sum_{\alpha \in \mathcal{J}} u_\alpha(t)\xi_\alpha$ in the Cameron-Martin basis $\Xi$.




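THEOREM 3.4. An $\mathcal{F}_t^W$-adapted process $u$ is a square integrable solution of (3.1) if and only if $u(t) = \sum_{\alpha \in \mathcal{J}} u_\alpha(t)\xi_\alpha$ so that the Fourier coefficients $u_\alpha$ satisfy

(3.7) $\sum_{\alpha \in \mathcal{J}} \Big(\int_0^T \|u_\alpha(t)\|_V^2\,dt + \sup_{0 \le t \le T} \|u_\alpha(t)\|_H^2\Big) < \infty$

and solve the propagator

(3.8) $u_\alpha(t) = u_0 1(|\alpha| = 0) + \int_0^t \big(\mathcal{A}u_\alpha(s) + f(s)1(|\alpha| = 0)\big)\,ds + \int_0^t \sum_{i,k} \sqrt{\alpha_i^k}\,\big(\mathcal{M}_k u_{\alpha^-(i,k)}(s) + g_k(s)1(|\alpha| = 1)\big)\,m_i(s)\,ds,$

where $\alpha^-(i, k)$ is the multi-index with components

$$(\alpha^-(i, k))_j^l = \begin{cases} \max(\alpha_i^k - 1, 0), & \text{if } i = j \text{ and } k = l, \\ \alpha_j^l, & \text{otherwise.} \end{cases}$$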


Before presenting the proof, we define the Wiener Chaos solution of (3.1). The 
definition is motivated by Theorem 3.4. 
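The propagator (3.8) is lower-triangular because the equation for $u_\alpha$ only involves coefficients $u_{\alpha^-(i,k)}$ whose length is one less than $|\alpha|$. The sketch below illustrates the index bookkeeping only; it stores a multi-index sparsely as a mapping from $(i, k)$ to the nonzero entry $\alpha_i^k$, and the function names are hypothetical.

```python
def lower(alpha, i, k):
    """Return alpha^-(i, k): the (i, k) entry is decreased by one, clipped at zero."""
    out = dict(alpha)
    new_value = max(out.get((i, k), 0) - 1, 0)
    if new_value:
        out[(i, k)] = new_value
    else:
        out.pop((i, k), None)        # keep the representation sparse
    return out

def length(alpha):
    """|alpha| = sum of all entries of the multi-index."""
    return sum(alpha.values())

def parents(alpha):
    """Multi-indices alpha^-(i, k) with alpha_i^k > 0, i.e. those entering (3.8) for u_alpha."""
    return [lower(alpha, i, k) for (i, k) in alpha]
```

Solving the system in order of increasing $|\alpha|$ is therefore possible: the coefficient with $|\alpha| = 0$ solves a closed deterministic equation, and each later equation is driven by coefficients already computed.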


DEFINITION 3.5. A $V$-valued generalized random process $u$ is called a Wiener Chaos solution of (3.1) if the generalized Fourier coefficients $u_\alpha$, $\alpha \in \mathcal{J}$, of $u$ are a solution of the propagator (3.8).


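To prove Theorem 3.4 and to derive an alternative characterization of the Wiener Chaos solution, we need a few additional constructions.
Denote by $\mathcal{H}$ the set

$$\mathcal{H} = \bigcup_{n \ge 1} L_\infty((0, T); \mathbb{R}^n).$$

If $h \in \mathcal{H}$, then there exists an $N \ge 1$ so that $h = (h_1, \ldots, h_N)$, with each $h_k \in L_\infty((0, T))$. We define

(3.9)
$$\mathcal{E}(t, h) = \exp\Big(\sum_{k=1}^{N} \Big(\int_0^t h_k(s)\,dw_k(s) - \frac{1}{2}\int_0^t |h_k(s)|^2\,ds\Big)\Big), \qquad \mathcal{E}(h) = \mathcal{E}(T, h),$$
$$h_{k,i} = \int_0^T h_k(t)\,m_i(t)\,dt, \qquad m_i \in \mathfrak{m},$$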



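and

$$h^\alpha = \prod_{i,k} h_{k,i}^{\alpha_i^k}, \qquad \alpha \in \mathcal{J}.$$

The following properties of the process $\mathcal{E}(t, h)$ are verified by direct calculation:

(3.10) $\mathcal{E}(h) = \sum_{\alpha \in \mathcal{J}} \dfrac{h^\alpha}{\sqrt{\alpha!}}\,\xi_\alpha,$

(3.11) $\mathcal{E}(t, h) = 1 + \int_0^t \mathcal{E}(s, h)\,h_k(s)\,dw_k(s).$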


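One consequence of (3.10) is that $\mathcal{E}(h) \in L_{2,Q}(W)$ for every weight sequence $Q$, with $\|\mathcal{E}(h)\|^2_{L_{2,Q}(W)} = \exp\big(\sum_k q_k^2 \|h_k\|^2_{L_2((0,T))}\big)$. Another consequence of (3.10) is an alternative representation of $\xi_\alpha$:

(3.12) $\xi_\alpha = \dfrac{1}{\sqrt{\alpha!}} \dfrac{\partial^{|\alpha|}}{\partial h^\alpha} \mathcal{E}(h) \Big|_{h=0}.$


REMARK 3.6. By Lemma 4.3.2 in [32], the family $\{\mathcal{E}(h), h \in \mathcal{H}\}$ is dense in $L_2(W)$ and therefore in every $L_{2,Q}(W)$.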
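In one dimension, relation (3.10) reduces to the generating function of the Hermite polynomials, $\exp(xh - h^2/2) = \sum_n (h^n/\sqrt{n!})(H_n(x)/\sqrt{n!})$, and the norm formula reduces to $\sum_n q^{2n}h^{2n}/n! = \exp(q^2h^2)$. The sketch below checks both numerically for a single mode; it is an illustration under that single-mode assumption, with hypothetical function names and a truncation level chosen for convenience.

```python
from math import exp, factorial, sqrt

def hermite_value(n, x):
    """Value H_n(x) of the Hermite polynomial, via H_{k+1} = x H_k - k H_{k-1}."""
    prev, cur = 0.0, 1.0
    for k in range(n):
        prev, cur = cur, x * cur - k * prev
    return cur

def chaos_series(x, h, terms=40):
    """Partial sum of sum_n (h^n / sqrt(n!)) * (H_n(x) / sqrt(n!))."""
    return sum(h ** n / sqrt(factorial(n)) * hermite_value(n, x) / sqrt(factorial(n))
               for n in range(terms))

def weighted_norm_sq(q, h, terms=40):
    """Partial sum of sum_n q^{2n} h^{2n} / n!, the single-mode Q-weighted norm."""
    return sum((q * h) ** (2 * n) / factorial(n) for n in range(terms))
```

For example, `chaos_series(0.7, 0.3)` can be compared with `exp(0.7 * 0.3 - 0.3 ** 2 / 2)`, and `weighted_norm_sq(0.5, 1.2)` with `exp(0.36)`.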


PROOF OF THEOREM 3.4. (i) Assume that $u = u(t)$ is a traditional solution of (3.1). Equation (3.12) implies

$$u_\alpha(t) = E(u(t)\xi_\alpha) = \frac{1}{\sqrt{\alpha!}} \frac{\partial^{|\alpha|}}{\partial h^\alpha} E(u(t)\mathcal{E}(h)) \Big|_{h=0}.$$

Using the $\mathcal{F}_t^W$-measurability of $u(t)$ and the martingale property (3.11) of $\mathcal{E}(t, h)$, we derive

$$E(u(t)\mathcal{E}(h)) = E\big(u(t)E(\mathcal{E}(h) \mid \mathcal{F}_t^W)\big) = E(u(t)\mathcal{E}(t, h)),$$

and (3.8) follows after applying the Itô formula to the product $u(t)\mathcal{E}(t, h)$ and differentiating the resulting equation with respect to $h$. By Theorem 3.3.3(iii) in [22], this differentiation with respect to $h$ is justified.

The same arguments show that the time evolution of

(3.13) $\xi_\alpha(t) := E(\xi_\alpha \mid \mathcal{F}_t^W)$

is described by

(3.14) $\xi_\alpha(t) = 1(|\alpha| = 0) + \int_0^t \sum_{i,k} \sqrt{\alpha_i^k}\,\xi_{\alpha^-(i,k)}(s)\,m_i(s)\,dw_k(s).$
ik 


(ii) Conversely, assume that condition (3.7) holds. Then the process $u(t) = \sum_{\alpha \in \mathcal{J}} u_\alpha(t)\xi_\alpha$ satisfies

$$u \in L_2(W; L_2((0, T); V)) \cap L_2(W; C((0, T); H)),$$




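and the function $u_h = E(u\mathcal{E}(h))$ belongs to $L_2((0, T); V) \cap C((0, T); H)$. By Parseval's identity and relation (3.10),

$$u_h = \sum_{\alpha \in \mathcal{J}} \frac{h^\alpha}{\sqrt{\alpha!}}\,u_\alpha.$$

Using (3.8), we conclude that, for every $\varphi \in V$ and $h \in \mathcal{H}$, the function $u_h$ satisfies

$$\begin{aligned}
(u_h(t), \varphi)_H = (u_0, \varphi)_H &+ \int_0^t \langle \mathcal{A}u_h(s), \varphi \rangle\,ds + \int_0^t \langle f(s), \varphi \rangle\,ds \\
&+ \sum_{\alpha \in \mathcal{J}} \frac{h^\alpha}{\sqrt{\alpha!}} \int_0^t \sum_{i,k} \sqrt{\alpha_i^k}\,m_i(s)\big((\mathcal{M}_k u_{\alpha^-(i,k)}(s), \varphi)_H + (g_k(s), \varphi)_H 1(|\alpha| = 1)\big)\,ds.
\end{aligned}$$

To simplify the above equality note that if $Z(t) = \int_0^t (\mathcal{M}_k u(s), \varphi)_H\,dw_k(s)$, then the $\mathcal{F}_t^W$-measurability of $Z(t)$ and relation (3.14) imply

(3.15) $E(Z(t)\xi_\alpha) = E(Z(t)\xi_\alpha(t)) = \int_0^t \sum_{i,k} \sqrt{\alpha_i^k}\,m_i(s)\,(\mathcal{M}_k u_{\alpha^-(i,k)}(s), \varphi)_H\,ds.$

Similarly,

$$E\Big(\xi_\alpha \int_0^t (g_k(s), \varphi)_H\,dw_k(s)\Big) = \int_0^t \sum_{i,k} \sqrt{\alpha_i^k}\,m_i(s)\,(g_k(s), \varphi)_H 1(|\alpha| = 1)\,ds.$$

Therefore,

$$\int_0^t \sum_{i,k} \sqrt{\alpha_i^k}\,m_i(s)\big((\mathcal{M}_k u_{\alpha^-(i,k)}(s), \varphi)_H + (g_k(s), \varphi)_H 1(|\alpha| = 1)\big)\,ds = E\Big(\xi_\alpha \int_0^t \big((\mathcal{M}_k u(s), \varphi)_H + (g_k(s), \varphi)_H\big)\,dw_k(s)\Big).$$

As a result,

(3.16)
$$\begin{aligned}
E\big(\mathcal{E}(h)(u(t), \varphi)_H\big) = E\big(\mathcal{E}(h)(u_0, \varphi)_H\big) &+ E\Big(\mathcal{E}(h) \int_0^t \langle \mathcal{A}u(s), \varphi \rangle\,ds\Big) + E\Big(\mathcal{E}(h) \int_0^t \langle f(s), \varphi \rangle\,ds\Big) \\
&+ E\Big(\mathcal{E}(h) \int_0^t \big((\mathcal{M}_k u(s), \varphi)_H + (g_k(s), \varphi)_H\big)\,dw_k(s)\Big).
\end{aligned}$$

Equality (3.16) and Remark 3.6 imply that, for each $t$ and each $\varphi$, (3.4) holds with probability 1. Due to continuity of $u$, a single probability-1 set can be chosen for all $t \in [0, T]$.

Theorem 3.4 is proved. □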




Theorem 3.4 implies that, if it exists, the traditional solution of (3.1) coincides 
with the Wiener Chaos solution. 
We now give an alternative characterization of the Wiener Chaos solution. 


THEOREM 3.7. A $V$-valued generalized random process $u$ is a Wiener Chaos solution of (3.1) if and only if, for every $h \in \mathcal{H}$, the function $u_h(t) = \langle\langle u(t), \mathcal{E}(h) \rangle\rangle$ is a solution of the equation

(3.17) $u_h(t) = u_0 + \int_0^t \big(\mathcal{A}u_h(s) + f(s) + h_k(s)\mathcal{M}_k u_h(s) + h_k(s)g_k(s)\big)\,ds.$

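PROOF. Assume that $u$ is a Wiener Chaos solution of (3.1), that is, $u \in L_{2,Q}(W; V)$ for some $Q$ and each $u_\alpha$ is a solution of (3.8). By definition (2.3) and relation (3.10),

(3.18) $u_h = \sum_{\alpha \in \mathcal{J}} \dfrac{h^\alpha}{\sqrt{\alpha!}}\,u_\alpha.$

Then (3.17) follows from (3.8). Indeed,

$$\sum_{\alpha \in \mathcal{J}} \frac{h^\alpha}{\sqrt{\alpha!}} \sum_{i,k} \sqrt{\alpha_i^k}\,\mathcal{M}_k u_{\alpha^-(i,k)}\,m_i = \sum_{i,k} \sum_{\alpha \in \mathcal{J}} \frac{h^{\alpha^-(i,k)}}{\sqrt{\alpha^-(i,k)!}}\,\mathcal{M}_k u_{\alpha^-(i,k)}\,m_i h_{k,i} = h_k \mathcal{M}_k u_h.$$

Computations for the other terms are similar.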
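The rearrangement in the last display rests on the elementary identity $\frac{h^\alpha}{\sqrt{\alpha!}}\sqrt{\alpha_i^k} = \frac{h^{\alpha^-(i,k)}}{\sqrt{\alpha^-(i,k)!}}\,h_{k,i}$, valid whenever $\alpha_i^k \ge 1$. The sketch below checks it numerically on a small example. It assumes a sparse storage in which a multi-index maps $(i, k)$ to $\alpha_i^k$ and the numbers $h_{k,i}$ are keyed by $(k, i)$; all names are hypothetical.

```python
from math import factorial, isclose, prod, sqrt

def h_power(h, alpha):
    """h^alpha = prod_{i,k} h_{k,i}^{alpha_i^k}."""
    return prod(h[(k, i)] ** a for (i, k), a in alpha.items())

def alpha_factorial(alpha):
    """alpha! = prod_{i,k} (alpha_i^k)!."""
    return prod(factorial(a) for a in alpha.values())

def lower(alpha, i, k):
    """alpha^-(i, k): the (i, k) entry is decreased by one, clipped at zero."""
    out = dict(alpha)
    if out.get((i, k), 0) <= 1:
        out.pop((i, k), None)
    else:
        out[(i, k)] -= 1
    return out

def both_sides(h, alpha, i, k):
    """Left and right sides of the identity; requires alpha_i^k >= 1."""
    lhs = h_power(h, alpha) / sqrt(alpha_factorial(alpha)) * sqrt(alpha[(i, k)])
    low = lower(alpha, i, k)
    rhs = h_power(h, low) / sqrt(alpha_factorial(low)) * h[(k, i)]
    return lhs, rhs
```

Summing the right side over $i$ then uses $\sum_i h_{k,i}\,m_i = h_k$, the expansion of $h_k$ in the basis $\mathfrak{m}$.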


Conversely, assume that $u \in L_{2,Q}(W; L_2((0, T); V))$ and $u_h \in L_2((0, T); V)$ is a solution of (3.17). By relation (3.18),

$$u_\alpha = \frac{1}{\sqrt{\alpha!}} \frac{\partial^{|\alpha|}}{\partial h^\alpha} u_h \Big|_{h=0}.$$

Then term-by-term differentiation in (3.17) implies (3.8). Theorem 3.7 is proved. □


To conclude this section, we briefly discuss the relation between the Wiener Chaos and white noise solutions. While the white noise solution for (3.1) has not been defined in general, analysis of the particular cases [11, 12, 34] shows that the white noise solution exists provided (3.17) has a solution that is an analytic function of $h$. The white noise solution of (3.1) is constructed as an element of the space


(48).,(X) = le: `> rz lla |< < oo for some £ > ol, 


aed 




where 


r2 = (al)? T [(2iky 5, pe [0, 11; 
i,k 


(8)_9(R®) is known as the space of Hida distributions [11]. Even though (4)—p (X) 
is not, in general, related to Lo o (VW; X), in many examples one can take either 
dk =q or qx = Cq* for some q < 1. For such weights, 


L2, QW; X) C (8)—o(X) 


with strict inclusion, so that the Wiener Chaos approach provides better regularity 
results. In addition, we remark that in contrast to the Wiener Chaos solutions, the 
white noise solutions are weak solutions in the probabilistic sense. 


4. Linear evolution equations in weighted Wiener Chaos spaces. In this 
section we present the main existence, uniqueness and regularity result for the 
Wiener Chaos solution of (3.1). We make the following assumptions: 


(A1) There exist positive numbers $C_1$ and $\delta$ so that, for every $v \in V$ and $t \in [0, T]$,

(4.1)  $\langle A(t)v, v \rangle + \delta \|v\|_V^2 \le C_1 \|v\|_H^2.$

(A2) There exists a real number $C_2$ and a sequence of positive numbers $Q =
\{q_k, k \ge 1\}$ so that, for every $v \in V$ and $t \in [0, T]$,

(4.2)  $2\langle A(t)v, v \rangle + \sum_{k \ge 1} q_k^2 \|M_k(t)v\|_H^2 \le C_2 \|v\|_H^2.$

(A3) The initial condition $u_0$ is nonrandom and belongs to $H$; the process $f =
f(t)$ is deterministic and $\int_0^T \|f(t)\|_{V'}^2\, dt < \infty$; each $g_k = g_k(t)$ is a deterministic
process and

$\sum_{k \ge 1} \int_0^T q_k^2 \|g_k(t)\|_H^2\, dt < \infty.$


Denote by $(\Phi_{t,s}, 0 \le s \le t \le T)$ the semigroup generated by the operator $A$;
$\Phi_t := \Phi_{t,0}$. In other words, the solution of the equation

$v(t) = v_0 + \int_s^t A v(\tau)\, d\tau + \int_s^t f(\tau)\, d\tau, \quad 0 \le s \le t \le T,$

$v_0 \in H$, $f \in L_2((0, T); V')$, is written as

$v(t) = \Phi_{t,s} v_0 + \int_s^t \Phi_{t,\tau} f(\tau)\, d\tau.$

This solution exists by assumption (A1) and Theorem 3.3. Denote by $u_{(0)}$ the
solution of (3.8) when $|\alpha| = 0$.




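THEOREM 4.1. Under assumptions (A1)–(A3), (3.1) has a unique Wiener
Chaos solution. The solution $u = u(t)$ has the following properties:

(1) For every $\gamma \in (0, 1)$,

(4.3)  $u \in L_{2,\gamma Q}(\mathbb{W}; L_2((0, T); V)) \cap L_{2,\gamma Q}(\mathbb{W}; C((0, T); H)).$

(2) For every $0 \le t \le T$, $u(t) \in L_{2,Q}(\mathbb{W}; H)$ and the following relations hold:

(4.4)  $\sum_{|\alpha|=n} q^\alpha u_\alpha(t) \xi_\alpha = \sum_{k_1,\dots,k_n \ge 1} \int_0^t \int_0^{s_n} \cdots \int_0^{s_2} \Phi_{t,s_n} \bar M_{k_n} \cdots \Phi_{s_2,s_1} \bigl( \bar M_{k_1} u_{(0)}(s_1) + q_{k_1} g_{k_1}(s_1) \bigr)\, dw_{k_1}(s_1) \cdots dw_{k_n}(s_n), \quad n \ge 1,$

where $\bar M_k = q_k M_k$;

(4.5)  $\|u(t)\|^2_{L_{2,Q}(\mathbb{W}; H)} \le 3 e^{C_2 t} \Bigl( \|u_0\|_H^2 + C_f \int_0^t \|f(s)\|_{V'}^2\, ds + \sum_{k \ge 1} \int_0^t q_k^2 \|g_k(s)\|_H^2\, ds \Bigr),$

where the number $C_2$ is from (4.2) and the positive number $C_f$ depends only on $\delta$
and $C_1$ from (4.1).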

PROOF. The arguments build on the techniques from [25].

(1) By Theorem 3.3, there exists a unique traditional solution of equation

(4.6)  $v(t) = u_0 + \int_0^t (A v(s) + f(s))\, ds + \sum_{k \ge 1} \int_0^t (M_k v(s) + g_k(s))\, \gamma q_k\, dw_k(s).$

By Theorems 3.4 and 3.7, $v_\alpha = \gamma^{|\alpha|} q^\alpha u_\alpha$, and (4.3) follows.

(2) Direct computations show that, for $|\alpha| = 1$ with $\alpha_i^k = 1$, the corresponding
solution $u_\alpha := u_{(ik)}$ of (3.8) is

(4.7)  $u_{(ik)}(t) = \int_0^t \Phi_{t,s} \bigl( M_k u_{(0)}(s) + g_k(s) \bigr) m_i(s)\, ds.$

Since $\xi_{(ik)} = \xi_{ik} = \int_0^T m_i(s)\, dw_k(s)$, we conclude that

$\sum_{i \ge 1} u_{(ik)}(t) \xi_{ik} = \int_0^t \Phi_{t,s} \bigl( M_k u_{(0)}(s) + g_k(s) \bigr)\, dw_k(s)$

or

$\sum_{|\alpha|=1} q^\alpha u_\alpha(t) \xi_\alpha = \sum_{k \ge 1} \int_0^t \Phi_{t,s} \bigl( \bar M_k u_{(0)}(s) + q_k g_k(s) \bigr)\, dw_k(s).$

Continuing by induction on $|\alpha|$ and using the relation between the Hermite
polynomials and the iterated Itô integrals ([13], Theorem 3.1), we derive (4.4).


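An immediate consequence of (4.4) is the following energy equality:

(4.8)  $\sum_{|\alpha|=n} q^{2\alpha} \|u_\alpha(t)\|_H^2 = \sum_{k_1,\dots,k_n \ge 1} \int_0^t \int_0^{s_n} \cdots \int_0^{s_2} \bigl\| \Phi_{t,s_n} \bar M_{k_n} \cdots \Phi_{s_2,s_1} \bigl( \bar M_{k_1} u_{(0)}(s_1) + q_{k_1} g_{k_1}(s_1) \bigr) \bigr\|_H^2\, ds^n,$

where $ds^n = ds_1 \cdots ds_n$.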

To derive (4.5), consider first the homogeneous equation with $f = g_k = 0$ and
define $F_n(t) = \sum_{|\alpha|=n} q^{2\alpha} \|u_\alpha(t)\|_H^2$, $n \ge 0$. Note that $u_{(0)}(t) = \Phi_t u_0$.

By assumption (4.2),

(4.9)  $\frac{d}{dt} F_0(t) \le C_2 F_0(t) - \sum_{k \ge 1} \|\bar M_k \Phi_t u_0\|_H^2.$


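For $n \ge 1$, the energy equality (4.8) implies

(4.10)  $\frac{d}{dt} F_n(t) = \sum_{k_1,\dots,k_n \ge 1} \int_0^t \int_0^{s_{n-1}} \cdots \int_0^{s_2} \| \bar M_{k_n} \Phi_{t,s_{n-1}} \bar M_{k_{n-1}} \cdots \bar M_{k_1} \Phi_{s_1} u_0 \|_H^2\, ds^{n-1}$
$\qquad + 2 \sum_{k_1,\dots,k_n \ge 1} \int_0^t \int_0^{s_n} \cdots \int_0^{s_2} \langle A \Phi_{t,s_n} \bar M_{k_n} \cdots \bar M_{k_1} \Phi_{s_1} u_0,\ \Phi_{t,s_n} \bar M_{k_n} \cdots \bar M_{k_1} \Phi_{s_1} u_0 \rangle\, ds^n.$

By assumption (4.2),

(4.11)  $2 \sum_{k_1,\dots,k_n \ge 1} \int_0^t \int_0^{s_n} \cdots \int_0^{s_2} \langle A \Phi_{t,s_n} \bar M_{k_n} \cdots \bar M_{k_1} \Phi_{s_1} u_0,\ \Phi_{t,s_n} \bar M_{k_n} \cdots \bar M_{k_1} \Phi_{s_1} u_0 \rangle\, ds^n$
$\qquad \le - \sum_{k_1,\dots,k_{n+1} \ge 1} \int_0^t \int_0^{s_n} \cdots \int_0^{s_2} \| \bar M_{k_{n+1}} \Phi_{t,s_n} \bar M_{k_n} \cdots \bar M_{k_1} \Phi_{s_1} u_0 \|_H^2\, ds^n$
$\qquad\quad + C_2 \sum_{k_1,\dots,k_n \ge 1} \int_0^t \int_0^{s_n} \cdots \int_0^{s_2} \| \Phi_{t,s_n} \bar M_{k_n} \cdots \bar M_{k_1} \Phi_{s_1} u_0 \|_H^2\, ds^n.$

As a result, for $n \ge 1$,

(4.12)  $\frac{d}{dt} F_n(t) \le C_2 F_n(t) + \sum_{k_1,\dots,k_n \ge 1} \int_0^t \int_0^{s_{n-1}} \cdots \int_0^{s_2} \| \bar M_{k_n} \Phi_{t,s_{n-1}} \bar M_{k_{n-1}} \cdots \bar M_{k_1} \Phi_{s_1} u_0 \|_H^2\, ds^{n-1}$
$\qquad - \sum_{k_1,\dots,k_{n+1} \ge 1} \int_0^t \int_0^{s_n} \cdots \int_0^{s_2} \| \bar M_{k_{n+1}} \Phi_{t,s_n} \bar M_{k_n} \cdots \bar M_{k_1} \Phi_{s_1} u_0 \|_H^2\, ds^n.$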
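Consequently, summing (4.9) and (4.12) over $n = 1, \dots, N$ (the sums telescope and
the last remaining term is nonpositive),

(4.13)  $\frac{d}{dt} \sum_{n=0}^N \sum_{|\alpha|=n} q^{2\alpha} \|u_\alpha(t)\|_H^2 \le C_2 \sum_{n=0}^N \sum_{|\alpha|=n} q^{2\alpha} \|u_\alpha(t)\|_H^2,$

so that, by the Gronwall inequality,

(4.14)  $\sum_{n=0}^N \sum_{|\alpha|=n} q^{2\alpha} \|u_\alpha(t)\|_H^2 \le e^{C_2 t} \|u_0\|_H^2$

or

(4.15)  $\|u(t)\|^2_{L_{2,Q}(\mathbb{W}; H)} \le e^{C_2 t} \|u_0\|_H^2.$

The remaining two cases, namely, $u_0 = g_k = 0$ and $u_0 = f = 0$, are analyzed in
the same way, and then (4.5) follows by the triangle inequality. Theorem 4.1 is
proved. □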
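The energy estimate can be illustrated on a scalar toy problem. The sketch below is not from the paper; it assumes $V = H = \mathbb{R}$, $A = a$, a single noise with $M = \sigma$, $f = g = 0$ and a constant weight $q$. Under these assumptions $\Phi_{t,s} = e^{a(t-s)}$, the $n$-fold integral in the energy equality equals $u_0^2 e^{2at} (q^2\sigma^2 t)^n / n!$, and (4.2) holds with equality for $C_2 = 2a + q^2\sigma^2$, so the Gronwall bound is attained in the limit $N \to \infty$.

```python
import math


def weighted_energy_level(n, t, a, sigma, q, u0):
    """F_n(t) for du = a u dt + sigma u dw with constant weight q (toy case)."""
    # The integrand (q sigma)^(2n) e^(2 a t) u0^2 is constant on the simplex
    # 0 < s_1 < ... < s_n < t, whose volume is t^n / n!.
    return u0**2 * math.exp(2.0 * a * t) * (q * sigma) ** (2 * n) * t**n / math.factorial(n)


def partial_energy(N, t, a, sigma, q, u0):
    """Sum of F_n(t) over n = 0, ..., N."""
    return sum(weighted_energy_level(n, t, a, sigma, q, u0) for n in range(N + 1))


def gronwall_bound(t, a, sigma, q, u0):
    """Right-hand side e^(C_2 t) u0^2 with C_2 = 2a + (q sigma)^2."""
    C2 = 2.0 * a + (q * sigma) ** 2
    return math.exp(C2 * t) * u0**2


t, a, sigma, q, u0 = 1.5, -0.4, 0.9, 1.3, 2.0
for N in (0, 2, 5, 30):
    print(N, partial_energy(N, t, a, sigma, q, u0), gronwall_bound(t, a, sigma, q, u0))
```

With $q = 1$ the limit is the familiar second moment $u_0^2 e^{(2a + \sigma^2)t}$ of geometric Brownian motion, which gives an independent check of the bookkeeping.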


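COROLLARY 4.2. Let $a_{ij}, b_i, c, \sigma_{ik}, \nu_k$ be deterministic measurable functions
of $(t, x)$ so that

$|a_{ij}(t, x)| + |b_i(t, x)| + |c(t, x)| + |\sigma_{ik}(t, x)| + |\nu_k(t, x)| \le K,$

$i, j = 1, \dots, d$, $k \ge 1$, $x \in \mathbb{R}^d$, $0 \le t \le T$;

$\bigl( a_{ij}(t, x) - \tfrac{1}{2} \sigma_{ik}(t, x) \sigma_{jk}(t, x) \bigr) y_i y_j \ge 0,$

$x, y \in \mathbb{R}^d$, $0 \le t \le T$; and

$\sum_{k \ge 1} |\nu_k(t, x)|^2 \le C_\nu < \infty,$

$x \in \mathbb{R}^d$, $0 \le t \le T$. Consider the equation

(4.16)  $du = \bigl( D_i(a_{ij} D_j u) + b_i D_i u + c u + f \bigr)\, dt + (\sigma_{ik} D_i u + \nu_k u + g_k)\, dw_k.$

Assume that the input data satisfy $u_0 \in L_2(\mathbb{R}^d)$, $f \in L_2((0, T); H_2^{-1}(\mathbb{R}^d))$,
$\sum_{k \ge 1} \|g_k\|^2_{L_2((0,T) \times \mathbb{R}^d)} < \infty$, and there exists an $\varepsilon > 0$ so that

$a_{ij}(t, x) y_i y_j \ge \varepsilon |y|^2, \quad x, y \in \mathbb{R}^d,\ 0 \le t \le T.$

Then there exists a unique Wiener Chaos solution $u = u(t, x)$ of (4.16). The
solution has the following regularity:

(4.17)  $u(t, \cdot) \in L_2(\mathbb{W}; L_2(\mathbb{R}^d)), \quad 0 \le t \le T,$

and

(4.18)  $E\|u(t, \cdot)\|^2_{L_2(\mathbb{R}^d)} \le C^* \Bigl( \|u_0\|^2_{L_2(\mathbb{R}^d)} + \|f\|^2_{L_2((0,T); H_2^{-1}(\mathbb{R}^d))} + \sum_{k \ge 1} \|g_k\|^2_{L_2((0,T) \times \mathbb{R}^d)} \Bigr),$

where the positive number $C^*$ depends only on $C_\nu$, $K$, $T$ and $\varepsilon$.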


PROOF. Direct computation shows that the operators

$A = D_i(a_{ij} D_j) + b_i D_i + c, \qquad M_k = \sigma_{ik} D_i + \nu_k$

satisfy assumptions (A1) and (A2) in the normal triple $(H_2^1(\mathbb{R}^d), L_2(\mathbb{R}^d),
H_2^{-1}(\mathbb{R}^d))$ with $q_k = 1$. Then relations (4.17) and (4.18) follow from (4.5). □


Note that in contrast to the statement of this corollary, all previous results con- 
cerning (4.16) required additional regularity of the coefficients and input data; see, 
for example, [35], Section 4.2. 

Theorem 4.1 is a bona fide extension of Theorem 3.3. Indeed, if condition (3.5)
holds so that (3.1) is strongly parabolic, then, taking $Q = 1$, we recover the
statement of Theorem 3.3. Further analysis of condition (3.5) indicates that, for a
strongly parabolic equation, one can find an admissible weight sequence $Q$ so that
$q_k = q > 1$. Since for $Q > 1$ we have a strict inclusion $L_{2,Q}(\mathbb{W}; X) \subset L_2(\mathbb{W}; X)$,
Theorem 4.1 represents an improvement of Theorem 3.3 for strongly parabolic
equations.

For weakly parabolic equations, similarly to the proof of Corollary 4.2, one can
take $Q = 1$, and then Theorem 4.1 represents an extension of the existing results
([35], Sections 3.2 and 4.2).

For nonparabolic equations, the results of Theorem 4.1 are completely new. 




REMARK 4.3. The formal Fourier series $u = \sum_{\alpha \in \mathcal{J}} u_\alpha \xi_\alpha$ for the Wiener
Chaos solution is a generalization of the representation formula for solutions of
SODE's, derived in [20], Theorem 4, using iterated Itô integrals; see also [23],
Theorem 3.2. Indeed,

$\sum_{\alpha \in \mathcal{J}} u_\alpha \xi_\alpha = \sum_{n \ge 0} \Bigl( \sum_{|\alpha|=n} u_\alpha \xi_\alpha \Bigr),$

and equality (4.4) connects the inner sum on the right with the iterated integrals.
See also Example 5.2 below.


5. Probabilistic representation of Wiener Chaos solutions. As before, let
$\mathbb{F} = (\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{0 \le t \le T}, \mathbb{P})$ be a stochastic basis with the usual assumptions and let
$W = (w_k(t), k \ge 1, 0 \le t \le T)$ be a collection of standard Wiener processes on $\mathbb{F}$.
Consider the linear equation in $\mathbb{R}^d$

(5.1)  $du = (a_{ij} D_i D_j u + b_i D_i u + c u + f)\, dt + (\sigma_{ik} D_i u + \nu_k u + g_k)\, dw_k$

under the following assumptions:

(B0) All coefficients, free terms and the initial condition are nonrandom.
(B1) (i) The functions $a_{ij} = a_{ij}(t, x)$ are measurable and bounded in $(t, x)$, and

$|a_{ij}(t, x) - a_{ij}(t, y)| \le C|x - y|, \quad x, y \in \mathbb{R}^d,\ 0 \le t \le T,$

with $C$ independent of $t, x, y$;
(ii) the matrix $(a_{ij})$ is uniformly positive definite, that is, there exists a $\delta > 0$
so that, for all vectors $y \in \mathbb{R}^d$ and all $(t, x)$, $a_{ij} y_i y_j \ge \delta |y|^2$.
(B2) The functions $b_i = b_i(t, x)$, $c = c(t, x)$ and $\nu_k = \nu_k(t, x)$ are measurable and
bounded in $(t, x)$.
(B3) The functions $\sigma_{ik} = \sigma_{ik}(t, x)$ are continuous and bounded in $(t, x)$.
(B4) There exists a $p > d + 1$ so that the functions $f = f(t, x)$ and $g_k = g_k(t, x)$
belong to


L (0, T); LARINA Lp (R^)). 
(B5) The initial condition $u_0 = u_0(x)$ belongs to $L_2(\mathbb{R}^d) \cap W_p^2(\mathbb{R}^d)$, $p > d + 1$,
where $W_p^2$ is the Sobolev space $\{f : f, D_i f, D_i D_j f \in L_p(\mathbb{R}^d)\}$.


Under assumptions (B2)–(B4), there exists a sequence $Q = \{q_k, k \ge 1\}$ of positive
numbers with the following properties:


(P1) The matrix $A$ with $A_{ij} = a_{ij} - (1/2) \sum_{k \ge 1} q_k^2 \sigma_{ik} \sigma_{jk}$ satisfies $A_{ij}(t, x) y_i y_j \ge 0$,
$x, y \in \mathbb{R}^d$, $0 \le t \le T$.
(P2) There exists a number C > 0 so that 


T 
Y (sup larve (t, x)|7 + Í gi gk Ne paay ) at) <C. 


k>] tX 




For the matrix $A$ and each $t, x$, we have $A_{ij}(t, x) = \tilde\sigma_{ik}(t, x) \tilde\sigma_{jk}(t, x)$, where
the functions $\tilde\sigma_{ik}$ are bounded. This representation might not be unique; see, for
example, [9], Theorem III.2.2 or [36], Lemma 5.2.1. Given any such representation
of $A$, consider the following backward Itô equation:

(5.2)  $X_{t,x,i}(s) = x_i + \int_s^t B_i(\tau, X_{t,x}(\tau))\, d\tau + \sum_{k \ge 1} \int_s^t q_k \sigma_{ik}(\tau, X_{t,x}(\tau))\, dw_k(\tau) + \int_s^t \tilde\sigma_{ik}(\tau, X_{t,x}(\tau))\, d\tilde w_k(\tau);$

$s \in (0, t)$, $t \in (0, T]$, $t$ fixed,

where $B_i = b_i - \sum_{k \ge 1} q_k^2 \sigma_{ik} \nu_k$ and $\tilde w_k$, $k \ge 1$, are independent standard Wiener
processes on $\mathbb{F}$ that are independent of $w_k$, $k \ge 1$. This equation might not have
a strong solution, but does have weak, or martingale, solutions due to assumptions
(B1)–(B3) and properties (P1) and (P2) of the sequence $Q$; this weak solution is
unique in the sense of probability law ([36], Theorem 7.2.1).
Let $Q$ be the operator

$Q : \{u_\alpha, \alpha \in \mathcal{J}\} \mapsto \{q^\alpha u_\alpha, \alpha \in \mathcal{J}\}.$


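THEOREM 5.1. Under assumptions (B0)–(B5), (5.1) has a unique Wiener
Chaos solution $u = u(t, x)$. If $Q$ is a sequence with properties (P1) and (P2), then
$u(t, \cdot) \in L_{2,Q}(\mathbb{W}; L_2(\mathbb{R}^d))$, $0 \le t \le T$, and the following representation holds:

(5.3)  $u(t, x) = Q^{-1} E\Bigl( \int_0^t f(s, X_{t,x}(s)) \gamma(t, s, x)\, ds + \sum_{k \ge 1} \int_0^t q_k g_k(s, X_{t,x}(s)) \gamma(t, s, x)\, dw_k(s) + u_0(X_{t,x}(0)) \gamma(t, 0, x) \Bigm| \mathcal{F}_t^W \Bigr),$

$t \le T$, $x \in \mathbb{R}^d$, where $X_{t,x}(s)$ is a weak solution of (5.2) and

(5.4)  $\gamma(t, s, x) = \exp\Bigl( \int_s^t c(\tau, X_{t,x}(\tau))\, d\tau + \sum_{k \ge 1} \int_s^t q_k \nu_k(\tau, X_{t,x}(\tau))\, dw_k(\tau) - \frac{1}{2} \int_s^t \sum_{k \ge 1} q_k^2 |\nu_k(\tau, X_{t,x}(\tau))|^2\, d\tau \Bigr).$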
PROOF. It is enough to establish (5.3) when $t = T$. Consider the equation

(5.5)  $dU = (a_{ij} D_i D_j U + b_i D_i U + c U + f)\, dt + \sum_{k \ge 1} (\sigma_{ik} D_i U + \nu_k U + g_k)\, q_k\, dw_k$

with initial condition $U(0, x) = u_0(x)$. In Theorem 4.1, take $Q = 1$ and consider
the normal triple $(H_2^1(\mathbb{R}^d), L_2(\mathbb{R}^d), H_2^{-1}(\mathbb{R}^d))$. Then (5.5) has a unique Wiener
Chaos solution $U$ and

$U(t, \cdot) \in L_2(\mathbb{W}; L_2(\mathbb{R}^d)), \quad 0 \le t \le T.$

By construction, the process $u = Q^{-1} U$ is the corresponding Wiener Chaos
solution of (5.1). To establish representation (5.3), consider the function $U_h =
E(U \mathcal{E}(h))$, $h \in \mathcal{H}$. According to Theorem 3.7, the function $U_h$ is the unique
solution of the equation

(5.6)  $dU_h = (a_{ij} D_i D_j U_h + b_i D_i U_h + c U_h + f)\, dt + \sum_{k \ge 1} (\sigma_{ik} D_i U_h + \nu_k U_h + g_k)\, q_k h_k\, dt$

with initial condition $U_h|_{t=0} = u_0$. We also define

(5.7)  $Y(T, x) = \int_0^T f(s, X_{T,x}(s)) \gamma(T, s, x)\, ds + \sum_{k \ge 1} \int_0^T g_k(s, X_{T,x}(s)) \gamma(T, s, x)\, q_k\, dw_k(s) + u_0(X_{T,x}(0)) \gamma(T, 0, x).$

By direct computation,

$E\bigl( \mathcal{E}(h) E(Y(T, x) \mid \mathcal{F}_T^W) \bigr) = E\bigl( \mathcal{E}(h) Y(T, x) \bigr) = E' Y(T, x),$

where $E'$ is the expectation with respect to the measure $dP'_T = \mathcal{E}(h)\, dP_T$ and $P_T$
is the restriction of $P$ to $\mathcal{F}_T$.

Under assumptions (B0)–(B5), the solution $U_h$ of (5.6) is continuous in $(t, x)$
and has a probabilistic representation via the Feynman–Kac formula; see [21],
Section 1.6. Using the Girsanov theorem ([14], Theorem 3.5.1), this representation
can be written as $U_h(T, x) = E' Y(T, x)$ or

$E\bigl( \mathcal{E}(h) U(T, x) \bigr) = E\bigl( \mathcal{E}(h) E(Y(T, x) \mid \mathcal{F}_T^W) \bigr).$

By Remark 3.6, the last equality implies $U(T, x) = E(Y(T, x) \mid \mathcal{F}_T^W)$ as elements
of $L_2(\mathbb{W})$.
Theorem 5.1 is proved. □


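EXAMPLE 5.2 (Krylov–Veretennikov formula). Consider the equation

(5.8)  $du = (a_{ij} D_i D_j u + b_i D_i u)\, dt + \sum_{k=1}^d \sigma_{ik} D_i u\, dw_k, \quad u(0, x) = u_0(x).$

Assume (B0)–(B5) and suppose that $a_{ij}(t, x) = \tfrac{1}{2} \sigma_{ik}(t, x) \sigma_{jk}(t, x)$. By
Corollary 4.2, (5.8) has a square integrable solution such that

$E\|u(t, \cdot)\|^2_{L_2(\mathbb{R}^d)} \le C^* \|u_0\|^2_{L_2(\mathbb{R}^d)}.$

By Theorem 4.1 with $Q = 1$,

(5.9)  $u(t, x) = \sum_{n=0}^\infty \sum_{|\alpha|=n} u_\alpha(t, x) \xi_\alpha = \Phi_t u_0(x) + \sum_{n=1}^\infty \sum_{k_1,\dots,k_n=1}^d \int_0^t \int_0^{s_n} \cdots \int_0^{s_2} \Phi_{t,s_n} \sigma_{j k_n} D_j \cdots \Phi_{s_2,s_1} \sigma_{i k_1} D_i \Phi_{s_1,0} u_0(x)\, dw_{k_1}(s_1) \cdots dw_{k_n}(s_n),$

where $\Phi_{t,s}$ is the semigroup generated by the operator $A = a_{ij} D_i D_j + b_i D_i$.
On the other hand, in this case, Theorem 5.1 yields

$u(t, x) = E\bigl( u_0(X_{t,x}(0)) \mid \mathcal{F}_t^W \bigr),$

where $W = (w_1, \dots, w_d)$ and

(5.10)  $X_{t,x,i}(s) = x_i + \int_s^t b_i(\tau, X_{t,x}(\tau))\, d\tau + \sum_{k=1}^d \int_s^t \sigma_{ik}(\tau, X_{t,x}(\tau))\, dw_k(\tau),$

$s \in (0, t)$, $t \in (0, T]$, $t$ fixed.
Thus, we have arrived at the Krylov–Veretennikov formula (cf. [20], Theorem 4):

(5.11)  $E\bigl( u_0(X_{t,x}(0)) \mid \mathcal{F}_t^W \bigr) = \Phi_t u_0(x) + \sum_{n=1}^\infty \sum_{k_1,\dots,k_n=1}^d \int_0^t \int_0^{s_n} \cdots \int_0^{s_2} \Phi_{t,s_n} \sigma_{j k_n} D_j \cdots \Phi_{s_2,s_1} \sigma_{i k_1} D_i \Phi_{s_1,0} u_0(x)\, dw_{k_1}(s_1) \cdots dw_{k_n}(s_n).$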
6. Passive scalar in a Gaussian field. The following viscous transport equation
is used to describe time evolution of a passive scalar $\theta$ in a given velocity
field $v$:

(6.1)  $\dot\theta(t, x) = \nu \Delta\theta(t, x) - v(t, x) \cdot \nabla\theta(t, x) + f(t, x); \quad x \in \mathbb{R}^d,\ d > 1.$

In the Kraichnan model of turbulent transport [10, 18], using the results from
[1] and [23], (6.1) can be written as an Itô stochastic evolution equation:

(6.2)  $d\theta(t, x) = (\nu \Delta\theta(t, x) + f(t, x))\, dt - \sigma_k(x) \cdot \nabla\theta(t, x)\, dw_k(t),$

where

(S1) $\{w_k(t), k \ge 1, t \ge 0\}$ is a collection of independent Wiener processes.

(S2) $\{\sigma_k, k \ge 1\}$ is an orthonormal basis in the space $H_C$, the reproducing kernel
Hilbert space corresponding to the spatial covariance function $C$ of $v$, so that
$\sigma_k^i(x) \sigma_k^j(y) = E v^i(x) v^j(y) = C^{ij}(x - y)$ and $C^{ij}(0) = \delta_{ij}$. The space $H_C$ is
all or part of the Sobolev space $H^{(d+\gamma)/2}(\mathbb{R}^d; \mathbb{R}^d)$, $\gamma \in (0, 2)$.

(S3) The initial condition $\theta_0$ is nonrandom and belongs to $L_2(\mathbb{R}^d)$.


Using the Wiener Chaos approach, it is possible to consider velocity fields $v$
that are even more turbulent, for example,

(6.3)  $v(t, x) = \sum_{k \ge 1} \sigma_k(x) \dot w_k(t),$

where $\{\sigma_k, k \ge 1\}$ is an orthonormal basis in $L_2(\mathbb{R}^d; \mathbb{R}^d)$. With $v$ as in (6.3), the
passive scalar equation (6.2) becomes

(6.4)  $\dot\theta(t, x) = \nu \Delta\theta(t, x) + f(t, x) - \nabla\theta(t, x) \cdot \dot W(t, x),$

where $\dot W = \dot W(t, x)$ is a $d$-dimensional space-time white noise.

In [6] and [34], a similar equation is studied using the white noise approach.
For more related results and references, see [12], Section 4.3. We will consider an
even more general equation.


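THEOREM 6.1. Suppose that $\nu > 0$ is a real number, the functions $\sigma_k^i(x)$ are
bounded and measurable, $\theta_0 \in L_2(\mathbb{R}^d)$ and $f \in L_2((0, T); H_2^{-1}(\mathbb{R}^d))$.
Fix $\varepsilon > 0$ and let $Q = \{q_k, k \ge 1\}$ be a sequence so that, for all $x, y \in \mathbb{R}^d$,

$2\nu |y|^2 - \sum_{k \ge 1} q_k^2 \sigma_k^i(x) \sigma_k^j(x) y_i y_j \ge \varepsilon |y|^2.$

Then there exists a unique Wiener Chaos solution of (6.2). This solution satisfies

$\theta \in L_{2,Q}(\mathbb{W}; L_2((0, T); H_2^1(\mathbb{R}^d))) \cap L_{2,Q}(\mathbb{W}; C((0, T); L_2(\mathbb{R}^d))).$


PROOF. The result follows from Theorem 4.1, with $A = \nu\Delta$, $M_k = \sigma_k^i D_i$, in
the normal triple $(H_2^1(\mathbb{R}^d), L_2(\mathbb{R}^d), H_2^{-1}(\mathbb{R}^d))$. □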


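REMARK 6.2. If $\max_i \sup_x |\sigma_k^i(x)| \le C_k$, $k \ge 1$, then a possible choice of $Q$ is
$q_k = (\delta\nu)^{1/2} / (d\, 2^k C_k)$, $0 < \delta < 2$.

If $\sigma_k^i(x) \sigma_k^j(x) \le C_\sigma < +\infty$, $i, j = 1, \dots, d$, $x \in \mathbb{R}^d$, then a possible choice of
$Q$ is $q_k = \varepsilon (2\nu / (C_\sigma d))^{1/2}$, $0 < \varepsilon < 1$.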
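The second choice of weights can be sanity-checked numerically at a fixed point $x$. The sketch below is an illustration under assumptions that are not in the paper: finitely many modes stored in a hypothetical $K \times d$ array `sigma` with `sigma[k, i]` standing for $\sigma_k^i(x)$, and $C_\sigma$ taken as the largest absolute entry of $S_{ij} = \sum_k \sigma_k^i \sigma_k^j$. Since $S_{ij} y_i y_j \le C_\sigma d |y|^2$, the constant weight $q = \varepsilon (2\nu / (C_\sigma d))^{1/2}$ should leave $2\nu |y|^2 - q^2 S_{ij} y_i y_j \ge 2\nu (1 - \varepsilon^2) |y|^2$.

```python
import numpy as np


def constant_weight(nu, C_sigma, d, eps):
    """Constant weight q = eps * sqrt(2 nu / (C_sigma d)), 0 < eps < 1."""
    return eps * np.sqrt(2.0 * nu / (C_sigma * d))


def ellipticity_margin(sigma, nu, eps):
    """Smallest eigenvalue of 2 nu I - q^2 S, with S_ij = sum_k sigma[k, i] sigma[k, j]."""
    d = sigma.shape[1]
    S = sigma.T @ sigma                      # symmetric, nonnegative definite
    C_sigma = np.max(np.abs(S))              # bound on the entries of S (assumed choice)
    q = constant_weight(nu, C_sigma, d, eps)
    return float(np.min(np.linalg.eigvalsh(2.0 * nu * np.eye(d) - q**2 * S)))


rng = np.random.default_rng(0)
sigma = rng.uniform(-1.0, 1.0, size=(50, 3))  # 50 hypothetical modes in d = 3
nu, eps = 0.1, 0.8
print(ellipticity_margin(sigma, nu, eps), 2.0 * nu * (1.0 - eps**2))
```

The first printed number should be at least as large as the second, which is the margin playing the role of $\varepsilon$ in the ellipticity condition of Theorem 6.1.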


When $\nu = 0$, (6.2) describes nonviscous transport [8] and can still be studied if
interpreted in the Stratonovich sense:

(6.5)  $d\theta(t, x) = f(t, x)\, dt - \sigma_k(x) \cdot \nabla\theta(t, x) \circ dw_k(t).$

In what follows, we assume that $f = 0$, each $\sigma_k$ is divergence free and
$\sigma_k^i(x) \sigma_k^j(x) = \delta_{ij}$. Then (6.5) has an equivalent Itô form

(6.6)  $d\theta(t, x) = \tfrac{1}{2} \Delta\theta(t, x)\, dt - \sigma_k^i(x) D_i\theta(t, x)\, dw_k(t).$
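The passage from the Stratonovich form to the Itô form is the usual Stratonovich correction. A short formal derivation, using only $f = 0$, $D_i \sigma_k^i = 0$ and $\sigma_k^i \sigma_k^j = \delta_{ij}$, and assuming enough smoothness of $\sigma_k$ and $\theta$ to differentiate (an assumption made here for illustration), reads:

```latex
\begin{aligned}
-\sigma_k^i D_i\theta \circ dw_k
  &= -\sigma_k^i D_i\theta\, dw_k
     + \tfrac12\, \sigma_k^j D_j\bigl(\sigma_k^i D_i\theta\bigr)\, dt \\
  &= -\sigma_k^i D_i\theta\, dw_k
     + \tfrac12\, \sigma_k^i \sigma_k^j D_i D_j\theta\, dt
     + \tfrac12\, \bigl(\sigma_k^j D_j \sigma_k^i\bigr) D_i\theta\, dt, \\
\sigma_k^j D_j \sigma_k^i
  &= D_j\bigl(\sigma_k^i \sigma_k^j\bigr) - \sigma_k^i D_j \sigma_k^j
   = D_j \delta_{ij} - 0 = 0, \\
\tfrac12\, \sigma_k^i \sigma_k^j D_i D_j\theta
  &= \tfrac12\, \delta_{ij} D_i D_j\theta = \tfrac12\, \Delta\theta .
\end{aligned}
```

The first-order correction vanishes because of the divergence-free condition, and the second-order term collapses to $\tfrac12 \Delta\theta$ because of the normalization $\sigma_k^i \sigma_k^j = \delta_{ij}$.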


The following result (cf. [25]) summarizes the main facts about the Wiener Chaos
solution of (6.6).


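THEOREM 6.3. In addition to (S1)–(S3), assume that each $\sigma_k$ is divergence
free. Then there exists a unique Wiener Chaos solution $\theta = \theta(t, x)$ of (6.6). This
solution has the following properties:

(i) For each $t \in [0, T]$, $\theta(t, \cdot) \in L_2(\mathbb{W}; L_2(\mathbb{R}^d))$.
(ii) For every $\varphi \in C_0^\infty(\mathbb{R}^d)$ and all $t \in [0, T]$, the equality

(6.7)  $(\theta, \varphi)(t) = (\theta_0, \varphi) + \frac{1}{2} \int_0^t (\theta, \Delta\varphi)(s)\, ds + \int_0^t (\theta, \sigma_k^i D_i \varphi)(s)\, dw_k(s)$

holds in $L_2(\mathbb{W})$, where $(\cdot, \cdot)$ is the inner product in $L_2(\mathbb{R}^d)$.
(iii) If $\theta_0 \in W_p^2(\mathbb{R}^d)$, $p > d + 1$, and $X$ is a weak solution of

(6.8)  $X_{t,x} = x + \int_0^t \sigma_k(X_{s,x})\, dw_k(s),$

then

(6.9)  $\theta(t, x) = E\bigl( \theta_0(X_{t,x}) \mid \mathcal{F}_t^W \bigr), \quad 0 \le t \le T,\ x \in \mathbb{R}^d.$

(iv) For $1 < p < \infty$ and $r \in \mathbb{R}$, denote by $L_{p,(r)}(\mathbb{R}^d)$ the space of measurable
functions with finite norm

$\|f\|^p_{L_{p,(r)}(\mathbb{R}^d)} = \int_{\mathbb{R}^d} |f(x)|^p (1 + |x|^2)^{pr/2}\, dx.$

Then there exists a number $K$ depending only on $p$ and $r$ so that

(6.10)  $E\|\theta(t, \cdot)\|^p_{L_{p,(r)}(\mathbb{R}^d)} \le e^{Kt} \|\theta_0\|^p_{L_{p,(r)}(\mathbb{R}^d)}, \quad t > 0.$

In particular, if $r = 0$, then $K = 0$.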


PROOF. (i) This follows from Theorem 4.1, because (6.6) is weakly parabolic
and one can take $Q = 1$.

(ii) Since for each $k$ we have $D_i \sigma_k^i = 0$ in the sense of distributions, the same
arguments as in the proof of the second part of Theorem 3.4 result in (6.7).

(iii) Equality (6.9) follows from Theorem 5.1 after observing that the
time-homogeneity of $\sigma_k$ allows us to rewrite the corresponding backward equation
as (6.8).

(iv) To prove (6.10), denote by $S_t : \theta_0(\cdot) \mapsto \theta(t, \cdot)$, $t > 0$, the solution operator
for (6.6). Direct calculations show that, for every $r \in \mathbb{R}$, assumptions (A1)–(A2)
of Theorem 4.1 are satisfied with $V = H^1_{2,(r)}(\mathbb{R}^d) = \{f : f, D_i f \in L_{2,(r)}(\mathbb{R}^d)\}$,
$H = L_{2,(r)}(\mathbb{R}^d)$, $V' = H^{-1}_{2,(r)}(\mathbb{R}^d)$ and $Q = 1$. Then $S_t$ is a bounded linear operator
from $L_{2,(r)}(\mathbb{R}^d)$ to $L_{2,(r)}(\Omega \times \mathbb{R}^d)$ and

$\|S_t \theta_0\|_{L_{2,(r)}(\Omega \times \mathbb{R}^d)} \le e^{K_2 t} \|\theta_0\|_{L_{2,(r)}(\mathbb{R}^d)}$

for every $r \in \mathbb{R}$, where the number $K_2$ depends only on $r$, and

$\|f\|^2_{L_{2,(r)}(\Omega \times \mathbb{R}^d)} = E\|f\|^2_{L_{2,(r)}(\mathbb{R}^d)}.$

The calculations also show that $K_2 = 0$ if $r = 0$.
By (6.9), $S_t$ is a bounded linear operator from $L_\infty(\mathbb{R}^d)$ to $L_\infty(\Omega \times \mathbb{R}^d)$ and

$\|S_t \theta_0\|_{L_\infty(\Omega \times \mathbb{R}^d)} \le \|\theta_0\|_{L_\infty(\mathbb{R}^d)}.$

Interpreting the last inequality in the weighted spaces $L_{\infty,(r)}$ with weight $r = 0$,
we interpolate between $L_\infty = L_{\infty,(0)}$ and $L_{2,(r)}$ and derive (6.10) from the Stein
interpolation theorem in weighted spaces (see, e.g., [2], Theorem 4.3.6).
Theorem 6.3 is proved. □


Acknowledgment. We are indebted to the referee for a number of insightful 
suggestions. 


REFERENCES 


[1] BAXENDALE, P. and HARRIS, T. E. (1986). Isotropic stochastic flows. Ann. Probab. 14 
1155-1179. MR0866340 

[2] BENNETT, C. and SHARPLEY, R. (1988). Interpolation of Operators. Academic Press, Boston. 
MR0928802 

[3] BUDHIRAJA, A. and KALLIANPUR, G. (1996). Approximations to the solution of the Zakai 
equations using multiple Wiener and Stratonovich integral expansions. Stochastics Sto- 
chastics Rep. 56 271—315. MR1396764 

[4] CAMERON, R. H. and MARTIN, W. T. (1947). The orthogonal development of nonlinear func- 
tionals in a series of Fourier-Hermite functions. Ann. of Math. 48 385—392. MR0020230 

[5] DA PRATO, G. and ZABCZYK, J. (1992). Stochastic Equations in Infinite Dimensions. Cam- 
bridge Univ. Press. MR1207136 

[6] DECK, T. and POTTHOFF, J. (1998). On a class of stochastic partial differential equations 
related to turbulent transport. Probab. Theory Related Fields 111 101—122. MR1626774 

[7] DOERING, C. R. and GIBBON, J. D. (1995). Applied Analysis of the Navier-Stokes Equations. 
Cambridge Univ. Press. MR1325465 

[8] E, W. and VANDEN EIJNDEN, E. (2000). Generalized flows, intrinsic stochasticity, and turbulent
transport. Proc. Natl. Acad. Sci. USA 97 8200-8205. MR1771642

[9] FREIDLIN, M. (1985). Functional Integration and Partial Differential Equations. Princeton 
Univ. Press. MR0833742 

[10] GAWEDZKI, K. and VERGASSOLA, M. (2000). Phase transition in the passive scalar advection. 

Phys. D 138 63-90. MR1745418 


[11] HIDA, T., KUO, H.-H., POTTHOFF, J. and STREIT, L. (1993). White Noise. Kluwer Academic
Publishers, Boston. MR1244577

[12] HOLDEN, H., ØKSENDAL, B., UBØE, J. and ZHANG, T. (1996). Stochastic Partial Differential
Equations. Birkhäuser, Boston. MR1408433

[13] ITÔ, K. (1951). Multiple Wiener integral. J. Math. Soc. Japan 3 157–169. MR0044064

[14] KARATZAS, I. and SHREVE, S. E. (1991). Brownian Motion and Stochastic Calculus, 2nd ed.
Springer, New York. MR1121940

[15] KÖTHE, G. (1968). Über nukleare Folgenräume. Studia Math. 34 267–271. (In German.)
MR0236651

[16] KÖTHE, G. (1970). Stark nukleare Folgenräume. J. Fac. Sci. Univ. Tokyo Sect. I 17 291–296.
(In German.) MR0280970

[17] KÖTHE, G. (1971). Nuclear sequence spaces. Math. Balkanica 1 144–146. MR0288550

[18] KRAICHNAN, R. H. (1968). Small-scale structure of a scalar field convected by turbulence.
Phys. Fluids 11 945–963.

[19] KRYLOV, N. V. (1999). An analytic approach to SPDEs. In Stochastic Partial Differential
Equations: Six Perspectives. Mathematical Surveys and Monographs (B. L. Rozovskii
and R. Carmona, eds.) 185–242. Amer. Math. Soc., Providence, RI. MR1661766

[20] KRYLOV, N. V. and VERETENNIKOV, A. J. (1976). On explicit formula for solutions of
stochastic equations. Math. USSR Sbornik 29 239–256. MR0410921

[21] KRYLOV, N. V. and ZVONKIN, A. K. (1981). On strong solutions of stochastic differential
equations. Sel. Math. Sov. 1 19–61.

[22] KUNITA, H. (1982). Stochastic Flows and Stochastic Differential Equations. Cambridge Univ.
Press. MR1070361

[23] LEJAN, Y. and RAIMOND, O. (2002). Integration of Brownian vector fields. Ann. Probab. 30
826–873. MR1905858

[24] LOTOTSKY, S. V., MIKULEVICIUS, R. and ROZOVSKII, B. L. (1997). Nonlinear filtering
revisited: A spectral approach. SIAM J. Control. Optim. 35 435–461. MR1436632

[25] LOTOTSKY, S. V. and ROZOVSKII, B. L. (2004). Passive scalar equation in a turbulent
incompressible Gaussian velocity field. Russian Math. Surveys 59 297–312. MR2086638

[26] MATTINGLY, J. C. and SINAI, Y. G. (1999). An elementary proof of the existence and
uniqueness theorem for the Navier-Stokes equations. Commun. Contemp. Math. 1 497–516.
MR1719695

[27] MIKULEVICIUS, R. and ROZOVSKII, B. L. (1993). Separation of observations and parameters
in nonlinear filtering. In Proc. 32nd IEEE Conf. on Decision and Control, San Antonio,
Texas, 1993 2 1564–1569. IEEE Control Systems Society.

[28] MIKULEVICIUS, R. and ROZOVSKII, B. L. (1998). Linear parabolic stochastic PDE's and
Wiener chaos. SIAM J. Math. Anal. 29 452–480. MR1616515

[29] MIKULEVICIUS, R. and ROZOVSKII, B. L. (2001). Stochastic Navier-Stokes equations.
Propagation of chaos and statistical moments. In Optimal Control and Partial Differential
Equations: In Honour of Alain Bensoussan (J. L. Menaldi, E. Rofman and A. Sulem,
eds.) 258–267. IOS Press, Amsterdam.

[30] NUALART, D. and ROZOVSKII, B. (1997). Weighted stochastic Sobolev spaces and bilinear
SPDE's driven by space-time white noise. J. Funct. Anal. 149 200–225. MR1471105

[31] OCONE, D. (1983). Multiple integral expansions for nonlinear filtering. Stochastics 10 1–30.
MR0714705

[32] ØKSENDAL, B. K. (1998). Stochastic Differential Equations: An Introduction With
Applications, 5th ed. Springer, Berlin. MR1619188

[33] PARDOUX, E. (1975). Équations aux dérivées partielles stochastiques non linéaires monotones.
Étude de solutions fortes de type Itô. Univ. Paris Sud, thèse Doct. Sci. Math.



[34] POTTHOFF, J., VÅGE, G. and WATANABE, H. (1998). Generalized solutions of linear
parabolic stochastic partial differential equations. Appl. Math. Optim. 38 95-107.
MR1620803

[35] ROZOVSKII, B. L. (1990). Stochastic Evolution Systems. Kluwer Academic Publishers, Dor- 
drecht. MR1135324 

[36] STROOCK, D. W. and VARADHAN, S. R. S. (1979). Multidimensional Diffusion Processes.
Springer, Berlin. MR0532498

[37] WONG, E. (1981). Explicit solutions to a class of nonlinear filtering problems. Stochastics 16 
311-321. MR0636084 


DEPARTMENT OF MATHEMATICS
UNIVERSITY OF SOUTHERN CALIFORNIA
3620 S. VERMONT AVE., KAP 108
LOS ANGELES, CALIFORNIA 90089-2532
USA
E-MAIL: lototsky@math.usc.edu
rozovski@math.usc.edu
URL: www.usc.edu/schools/college/mathematics/people/faculty/lototsky.html
www.usc.edu/dep/LAS/CAMS/usr/facmemb/boris/main.htm


The Annals of Probability

2006, Vol. 34, No. 2, 663-727

DOI: 10.1214/009117905000000666

© Institute of Mathematical Statistics, 2006


KOLMOGOROV EQUATIONS IN INFINITE DIMENSIONS:
WELL-POSEDNESS AND REGULARITY OF SOLUTIONS,
WITH APPLICATIONS TO STOCHASTIC GENERALIZED
BURGERS EQUATIONS¹


By MICHAEL RÖCKNER AND ZEEV SOBOL
Universität Bielefeld and University of Wales Swansea


We develop a new method to uniquely solve a large class of heat equations,
so-called Kolmogorov equations in infinitely many variables. The equa-
tions are analyzed in spaces of sequentially weakly continuous functions 
weighted by proper (Lyapunov type) functions. This way for the first time 


tions with possibly nonlocally Lipschitz drifts. Apart from general analytic 
interest, the main motivation is to apply this to uniquely solve martingale
problems in the sense of Stroock–Varadhan given by stochastic partial differential
equations from hydrodynamics, such as the stochastic Navier-Stokes
equations. In this paper this is done in the case of the stochastic generalized 
Burgers equation. Uniqueness is shown in the sense of Markov flows. 


1. Introduction. In this paper we develop a new technique to uniquely solve 
generalized heat equations, so-called Kolmogorov equations, in infinitely many 
variables of type 


for a large class of elliptic operators $L$. The main new idea is to study $L$ on
weighted function spaces consisting of sequentially weakly continuous functions
on the underlying infinite-dimensional Banach space $X$ (e.g., a classical
$L^2$-space). These function spaces are chosen appropriately for the specifically
given operator $L$. More precisely, the function space on which $L$ acts is weighted
by a properly chosen Lyapunov function $V$ of $L$ and the image space by a function
$\Theta$ bounding its image $LV$. Apart from general analytic interest, the motivation
for this work comes from the study of concrete stochastic partial differential
equations (SPDEs), such as, for example, those occurring in hydrodynamics (stochastic


Received January 2004; revised March 2005. 
¹Supported by the BiBoS-Research Centre and the DFG-Research Group "Spectral Analysis,
Asymptotic Distributions and Stochastic Dynamics."

AMS 2000 subject classifications. Primary 35R15, 47D06, 47D07, 60J35, 60J60; secondary
35J70, 35Q53, 60H15.
Key words and phrases. Stochastic Burgers equation, Kolmogorov equation, infinite-dimensional
background space, weighted space of continuous functions, Lyapunov function, Feller semigroup,
diffusion process.




664 M. RÓCKNER AND Z. SOBOL 


Navier-Stokes or Burgers equations, etc.). Transition probabilities of their solu- 
tions satisfy such Kolmogorov equations in infinitely many variables. To be more 
specific, below we shall describe a concrete case, to which we restrict in this paper, 
to explain the method in detail. 

Consider the following stochastic partial differential equation on

$X := L^2(0, 1) = L^2((0, 1), dr)$

(where $dr$ denotes Lebesgue measure):

(1.1)  $dx_t = (\Delta x_t + F(x_t))\, dt + \sqrt{A}\, dw_t, \qquad x_0 = x \in X.$

Here $A : X \to X$ is a nonnegative definite symmetric operator of trace class,
$(w_t)_{t \ge 0}$ a cylindrical Brownian motion on $X$, $\Delta$ denotes the Dirichlet Laplacian
(i.e., with Dirichlet boundary conditions) on $(0, 1)$, and $F : H_0^1 \to X$ is a measurable
vector field of type

$F(x)(r) := \frac{d}{dr} (\Psi \circ x)(r) + \Phi(r, x(r)), \quad x \in H_0^1(0, 1),\ r \in (0, 1).$

$H_0^1 := H_0^1(0, 1)$ denotes the Sobolev space of order 1 in $L^2(0, 1)$ with Dirichlet
boundary conditions and $\Psi : \mathbb{R} \to \mathbb{R}$, $\Phi : (0, 1) \times \mathbb{R} \to \mathbb{R}$ are functions satisfying
certain conditions specified below. In case $\Psi(x) = \tfrac{1}{2} x^2$, $\Phi \equiv 0$, SPDE (1.1)
is just the classical stochastic Burgers equation, and if $\Psi \equiv 0$ and, for example,
$\Phi(r, x) = -x^3$, we are in the situation of a classical stochastic reaction diffusion
equation of Ginzburg–Landau type. Therefore, we call (1.1) "stochastic generalized
Burgers equation."

Stochastic generalized Burgers equations have been studied in several papers. In
fact, the first who included both a "hydrodynamic part" (i.e., $\Psi$ above) and a
"reaction diffusion part" (i.e., $\Phi$ above) was Gyöngy in [29], where, as we do in this
paper, he also considered the case where the underlying domain is $D = (0, 1)$. Later,
jointly with Rovira in [31], he generalized his results to the case where $\Psi$ is allowed
to have polynomial growth; $\Phi$ is still assumed to have linear growth and is locally
Lipschitz with at most linearly growing Lipschitz constant. A further generalization
to $d$-dimensional domains was done by the same two authors in [32]. Contrary
to us, these authors purely concentrated on solving SPDE of type (1.1) directly and
did not analyze the corresponding Kolmogorov equations. In fact, they can allow
nonconstant (but globally Lipschitz) $\sqrt{A}$ and also explicitly time dependent
coefficients. We refer to [29, 31, 32] for the exact conditions, but emphasize that
always the reaction diffusion part is assumed to be locally Lipschitz and of at most
linear growth. As we shall see below, for the solution of the Kolmogorov equations,
our method allows the reaction diffusion part to be of polynomial growth
(so Ginzburg–Landau is in fact included) and also the locally Lipschitz condition
can be replaced by a much weaker condition of dissipative type [see
conditions (1)–(3) in Section 2 below].


KOLMOGOROV EQUATIONS IN INFINITE DIMENSIONS 665 


SPDE of type (1.1) with either $\Psi \equiv 0$ or $\Phi \equiv 0$ have been studied extensively.
For the case $\Psi \equiv 0$, the literature is so enormous that we cannot record it here,
but instead refer, for example, to the monographs [24] and [13] and the references
therein. For the case $\Phi \equiv 0$, we refer, for example, to [6, 10, 12, 18, 19, 30, 38, 39,
55], and for the classical deterministic case, for example, to [11, 33, 37, 41, 44].
References concerning the Kolmogorov equations for SPDE will be given below.

The motivation of handling both the hydrodynamic and reaction diffusion part
in SPDE of type (1.1) together was already laid out in [29]. It is well known that the
mathematical analysis is then much harder, standard theory has to be modified and
new techniques must be developed. It is, however, somehow imaginable that this,
with some effort, can be done if as in [29, 31, 32] $\Phi$ has at most linear growth (see,
e.g., Remark 8.2 in [35], where this is shown in a finite-dimensional situation).
The case of $\Phi$ with polynomial growth treated in this paper seems, however, much
harder. In contrast to [29, 31, 32], our methods require, on the other hand, that $\Psi$
grows less than 1x [772 for large x [cf. condition (W) in Section 2]. 

Showing the range of our method by handling $\Phi$ and $\Psi$ together has the
disadvantage that it makes the analysis technically quite hard. Therefore, the reader
who only wants to understand the basic ideas of our new general approach is
advised to read the paper under the assumption that $\Phi$ does not explicitly depend
on $r$ and has polynomial growth strictly less than 5. This simplifies the analysis
substantially [e.g., in definition (2.4) of the Lyapunov function below we can take
$p = 2$, so the simpler weight functions in (2.3) below suffice].

But now let us turn back to the Kolmogorov equations corresponding to 
SPDE (1.1). 

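A heuristic (i.e., not worrying about existence of solutions) application of Itô's
formula to (1.1) implies that the corresponding generator or Kolmogorov
operator $L$ on smooth cylinder functions $u : X \to \mathbb{R}$, that is,

$u \in D := \mathcal{F}C_b^2 := \{u = g \circ P_N \mid N \in \mathbb{N},\ g \in C_b^2(E_N)\}$ (cf. below),

is of the following form:

$Lu(x) := \tfrac{1}{2} \operatorname{Tr}(A D^2 u(x)) + (\Delta x + F(x), Du(x)) = \tfrac{1}{2} \sum_{i,j=1}^\infty A_{ij}\, \partial^2_{ij} u(x) + \sum_{k=1}^\infty (\Delta x + F(x), \eta_k)\, \partial_k u(x), \quad x \in H_0^1.$

Here $\eta_k(r) := \sqrt{2} \sin(\pi k r)$, $k \in \mathbb{N}$, is the eigenbasis of $\Delta$ in $L^2(0, 1)$, equipped
with the usual inner product $(\cdot, \cdot)$, $E_N := \operatorname{span}\{\eta_k \mid 1 \le k \le N\}$, $P_N$ is the
corresponding orthogonal projection, and $A_{ij} := (\eta_i, A\eta_j)$, $i, j \in \mathbb{N}$. Finally, $Du$, $D^2 u$
denote the first and second Fréchet derivatives, $\partial_k := \partial_{\eta_k}$, $\partial^2_{ij} := \partial_{\eta_i} \partial_{\eta_j}$ with $\partial_y :=$
directional derivative in direction $y \in X$ and $(\Delta x, \eta_k) := (x, \Delta\eta_k)$ for $x \in X$.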
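The ingredients of the drift part of $L$ can be made concrete with a small numerical sketch. It is only an illustration under assumptions not made in the paper at this point: the classical Burgers case $\Psi(s) = s^2/2$, $\Phi \equiv 0$, the test point $x = \eta_1$, a midpoint rule for integrals over $(0, 1)$ and finite differences for $d/dr$. The helper names are hypothetical.

```python
import numpy as np

N_GRID = 4000
r = (np.arange(N_GRID) + 0.5) / N_GRID        # midpoint grid on (0, 1)


def eta(k):
    """Eigenfunction eta_k(r) = sqrt(2) sin(pi k r) of the Dirichlet Laplacian."""
    return np.sqrt(2.0) * np.sin(np.pi * k * r)


def inner(f, g):
    """Midpoint-rule approximation of the L^2(0, 1) inner product."""
    return float(np.mean(f * g))


def burgers_drift(x):
    """F(x) = d/dr (Psi o x) with Psi(s) = s^2 / 2, by finite differences."""
    return np.gradient(0.5 * x**2, r)


def drift_coefficient(x, k):
    """(Delta x + F(x), eta_k), using (Delta x, eta_k) := (x, Delta eta_k)."""
    return -(np.pi * k) ** 2 * inner(x, eta(k)) + inner(burgers_drift(x), eta(k))


x = eta(1)
print(inner(eta(1), eta(1)), inner(eta(1), eta(2)))        # close to 1 and 0
print(drift_coefficient(x, 1), drift_coefficient(x, 2))    # close to -pi^2 and pi / sqrt(2)
```

For this test point $F(x)(r) = \pi \sin(2\pi r) = (\pi/\sqrt{2})\, \eta_2(r)$, so the nonlinearity feeds the second mode while the Laplacian damps the first; this coupling between modes is what makes the coefficients $(\Delta x + F(x), \eta_k)$ depend on infinitely many coordinates in general.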




Hence, the Kolmogorov equations corresponding to SPDE (1.1) are given by

(1.3)  $\frac{dv}{dt}(t, x) = \bar{L} v(t, x), \quad x \in X; \qquad v(0, \cdot) = f,$

where the function $f : X \to \mathbb{R}$ is a given initial condition for this parabolic PDE
with variables in the infinite-dimensional space $X$. We emphasize that (1.3) is only
reasonable for some extension $\bar{L}$ of $L$ (whose construction is an essential part of
the entire problem) since even for $f \in D$, it will essentially never be true that
$v(t, \cdot) \in D$.

Because of the lack of techniques to solve PDE in infinite dimensions, in
situations as described above the "classical" approach to solve (1.3) was to first
solve (1.1) and then show in what sense the transition probabilities of the solution
solve (1.3) (cf., e.g., [3, 13, 17, 24, 26, 27, 45, 50] and the references therein).
Since about 1998, however, a substantial part of recent work in this area (cf.,
e.g., [20, 52, 53] and one of the initiating papers, [46]) is based on the attempt
to solve Kolmogorov equations in infinitely many variables [as (1.3) above]
directly and, reversing strategies, use the solution to construct weak solutions, that
is, solutions in the sense of a martingale problem as formulated by Stroock and
Varadhan (cf. [54]) of SPDE as (1.1) above, even for very singular coefficients
(naturally appearing in many applications). In the above quoted papers, as in
several other works (e.g., [1, 4, 15, 16, 22, 23, 42]), the approach to solve (1.3) directly
was, however, based on L²(μ)-techniques where μ is a suitably chosen measure
depending on L, for example, μ is taken to be an infinitesimally invariant measure
of L (see below). So, only solutions to (1.3) in an L²(μ)-sense were obtained, in
particular, allowing μ-zero sets of x ∈ X for which (1.3) does not hold or where
(1.3) only holds for x in the topological support of μ (cf. [20]).

In this paper we shall present a new method to solve (1.3) for all x ∈ X (or an
explicitly described subset thereof) not using any reference measure. It is based on
finite-dimensional approximation, obtaining a solution which, despite the lack of
(elliptic and) parabolic regularity results on infinite-dimensional spaces, will,
nevertheless, have regularity properties. More precisely, setting X_p := L^p((0, 1), dr),
we shall construct a semigroup of Markov probability kernels p_t(x, dy), x ∈ X_p,
t > 0, on X_p such that, for all u ∈ D, the map t ↦ p_t(|Lu|)(x) is locally Lebesgue
integrable on [0, ∞) and

(1.4)   p_t u(x) − u(x) = ∫_0^t p_s(Lu)(x) ds   ∀ x ∈ X_p.

Here, as usual for a measurable function f : X_p → R, we set

(1.5)   p_t f(x) := ∫ f(y) p_t(x, dy),   x ∈ X_p, t > 0,


KOLMOGOROV EQUATIONS IN INFINITE DIMENSIONS 667 


if this integral exists. Here p has to be large enough compared to the growth of Φ (cf.
Theorem 2.2 below). Furthermore, p_t for each t > 0 maps a class of sequentially
weakly continuous (resp. a class of locally Lipschitz) functions growing at most
exponentially into itself. That p_t, for t > 0, maps the test function
space D (consisting of finitely based, hence, sequentially weakly continuous
functions) into itself (as is the case in finite dimensions at least if the coefficients
are sufficiently regular) cannot be true in our case since F depends on all
coordinates of x = Σ_{k=1}^∞ (x, η_k)η_k and not merely finitely many. So, the regularity
property of p_t, t > 0, to leave the space of exponentially bounded (and, since it is
Markov, hence, also the bounded) sequentially weakly continuous functions fixed
is the next best possible.

As a second step, we shall construct a conservative strong Markov process
with weakly continuous paths, which is unique under a mild growth condition and
which solves the martingale problem given by L, as in (1.2) and, hence, also (1.1)
weakly, for every starting point x ∈ X_p. We also construct an invariant measure
for this process.

The precise formulation of these results requires more preparation and is
therefore postponed to the next section (cf. Theorems 2.2–2.4), where we also collect
our precise assumptions. Now we would like to indicate the main ideas of the
proof and the main concepts. First of all, we emphasize that these concepts are
of a general nature and work in other situations as well (cf., e.g., the companion
paper [47] on the 2D stochastic Navier–Stokes equations). We restrict ourselves to
the case described above, so in particular to the (one-dimensional) interval (0, 1)
for the underlying state space X_p = L^p((0, 1), dr), in order to avoid additional
complications.

The general strategy is to construct the semigroup solving (1.4) through its
corresponding resolvent, that is, we have to solve the equation

    (λ − L)u = f

for all f in a function space and λ large enough, so that all u ∈ D appear as
solutions. The proper function spaces turn out to be weighted spaces of
sequentially weakly continuous functions on X. Such spaces are useful since their dual
spaces are spaces of measures, so despite the nonlocal compactness of the state
space X, positive linear functionals on such function spaces over X are
automatically measures (hence, positive operators on it are automatically kernels of positive
measures). To choose exponential weights is natural to make these function spaces,
which will remain invariant under the to-be-constructed resolvents and semigroups,
as large as possible. More precisely, one chooses a Lyapunov function V_{p,κ} of L
with weakly compact level sets so that

    (λ − L)V_{p,κ} ≥ Θ_{p,κ}

and so that Θ_{p,κ} is a "large" positive function of (weakly) compact level sets
[cf. (2.3), (2.4) below for the precise definitions]. Θ_{p,κ} "measures" the
coercivity of L [or of SPDE (1.1)]. Then one considers the corresponding spaces WC_{p,κ}




and W_1C_{p,κ} of sequentially weakly continuous functions over X, weighted by
V_{p,κ} and Θ_{p,κ}, respectively, with the corresponding weighted supnorms [cf. (2.2)
below]. Then for λ large, we consider the operator

    λ − L : D ⊂ WC_{p,κ} → W_1C_{p,κ}

and prove by an approximative maximum principle that, for some m > 0,

    ‖(λ − L)u‖_{W_1C_{p,κ}} ≥ m ‖u‖_{WC_{p,κ}}

(cf. Proposition 6.1). So we obtain dissipativity of this operator between these two
different spaces and the existence of its continuous inverse G_λ := (λ − L)^{-1}.
Considering a finite-dimensional approximation by operators L_N on E_N, N ∈ N, with
nice coefficients, more precisely, considering their associated resolvents (G_λ^N)_{λ>0},
we show that (λ − L)(D) has dense range and that the continuous extension of G_λ
to all of W_1C_{p,κ} is still one-to-one ("essential maximal dissipativity").
Furthermore, λG_λ^N (lifted to all of X) converges uniformly in λ to λG_λ, which, hence, turns
out to be strongly continuous, but only after restricting G_λ to WC_{p,κ}, which is
continuously embedded into W_1C_{p,κ}, so has a stronger topology (cf. Theorem 6.4).
Altogether (G_λ)_{λ>λ_0}, λ_0 large, is a strongly continuous resolvent on WC_{p,κ}, so we
can consider its inverse under the Laplace transform (Hille–Yosida theorem) to
obtain the desired semigroup (p_t)_{t>0} of operators which are automatically given by
probability kernels as explained above. Then one checks that p_t, t > 0, solves (1.4)
and is unique under a mild "growth condition" [cf. (2.17) and Proposition 6.7
below]. Subsequently, we construct a strong Markov process on X_p with weakly
continuous paths with transition semigroup (p_t)_{t>0}. By general theory, it then
solves the Stroock–Varadhan martingale problem corresponding to (L, D), hence,
it weakly solves SPDE (1.1). We also prove its uniqueness in the set of all Markov
processes satisfying the mild "growth condition" (2.18) below (cf. Theorem 7.1).
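
The passage from a resolvent to a semigroup via the Laplace transform can be illustrated in the simplest possible setting. The Python sketch below is a toy analogue and not the construction of the paper: it replaces L by the generator Q of a two-state Markov chain with hypothetical jump rates a and b, for which the semigroup p_t = exp(tQ) is known in closed form, and compares G_λ = (λ − Q)^{-1} with a numerical approximation of ∫_0^∞ e^{−λt} p_t dt. The truncation time and step count are assumptions chosen for the example.

```python
import math

a, b = 2.0, 1.0  # hypothetical jump rates 0 -> 1 and 1 -> 0
Q = [[-a, a], [b, -b]]  # generator of the two-state chain

def p(t):
    """Transition matrix exp(tQ) of the two-state chain (closed form)."""
    s = a + b
    e = math.exp(-s * t)
    return [[(b + a * e) / s, (a - a * e) / s],
            [(b - b * e) / s, (a + b * e) / s]]

def resolvent(lam):
    """G_lam = (lam - Q)^{-1}, computed by the explicit 2x2 inverse."""
    m = [[lam + a, -a], [-b, lam + b]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]

def laplace(lam, T=40.0, n=40000):
    """Midpoint-rule approximation of the integral of e^{-lam t} p_t over [0, T]."""
    h = T / n
    out = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(n):
        t = (k + 0.5) * h
        w = h * math.exp(-lam * t)
        pt = p(t)
        for i in range(2):
            for j in range(2):
                out[i][j] += w * pt[i][j]
    return out
```

In this toy case the rows of λG_λ sum to one, which is the finite-state counterpart of the resolvent being given by (scaled) probability kernels.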

In comparison to other constructions of semigroups on weighted function spaces 
using locally convex topologies and the concept of bicontinuous semigroups 
(cf. [36] and the references therein), we emphasize that our spaces are (separa- 
ble) Banach spaces so, as spaces with one single norm, are easier to handle. 

In comparison to other constructions of infinite-dimensional Markov processes
(see, e.g., [43, 52]) where capacitory methods were employed, we would like to
point out that instead of proving the tightness of capacities, we construct Lyapunov
functions (which are excessive functions in the sense of potential theory) with
compact level sets. The advantage is that we obtain pointwise statements for all
points in X_p, not just outside a set of zero capacity. Quite a lot is known about
the approximating semigroups (p_t^N)_{t>0}, that is, the ones corresponding to the
(G_λ^N)_{λ>0}, N ∈ N, mentioned above, since they solve classical finite-dimensional
Kolmogorov equations with regular coefficients. So, our construction also leads to
a way to "calculate" the solution (p_t)_{t>0} of the infinite-dimensional Kolmogorov
equation (1.3).




The organization of this paper is as follows: as already mentioned, in Section 2
we formulate the precise conditions (A) and (F1) on the diffusion coefficient A
and the drift F, respectively, and state our main results precisely. In Section 3
we prove the necessary estimates on R^N, uniformly in N, which are needed for
the finite-dimensional approximation. In Section 4 we introduce another
assumption (F2) on F, which is exactly the one we need in the proof, and we show that
it is weaker than (F1). In Section 5 we collect a few essential properties of our
weighted function spaces on X_p. In particular, we identify their dual spaces, which
is crucial for our analysis. This part was inspired by [34]. The semigroup of
kernels p_t(x, dy), t > 0, x ∈ X_p, is constructed in Section 6, and its uniqueness is
proved. Here we also prove further regularity properties of p_t, t > 0. The latter
part is not used subsequently in this paper. Section 7 is devoted to constructing
the process and showing that it is the solution of the martingale problem
given by L as in (1.2), hence, a weak solution to SPDE (1.1), and that it is unique in
the mentioned class of Markov processes (see also Lemma A.1 in the Appendix).
In deterministic language the latter means that we have uniqueness of the flows
given by solutions of (1.1). The invariant measure μ for (p_t)_{t>0} is constructed in
the Appendix by solving the equation L*μ = 0. As a consequence of the results
in the main part of the paper, we get that the closure (L^μ, Dom(L^μ)) of (L, D)
is maximal dissipative on L^s(X, μ), s ∈ [1, ∞) (cf. Remark A.3), that is, strong
uniqueness holds for (L, D) on L^s(X, μ). In particular, the differential form (1.3)
of (1.4) holds with L^μ replacing L̄ and the time derivative taken in L^s(X, μ).


2. Notation, conditions and main results. For a σ-algebra ℬ on an
arbitrary set E, we denote the space of all bounded (resp. positive) real-valued
ℬ-measurable functions by ℬ_b, ℬ^+, respectively. If E is equipped with a
topology, then ℬ(E) denotes the corresponding Borel σ-algebra. The spaces X =
L²(0, 1) and H_0^1 are as in the Introduction and they are equipped with their usual
norms |·|_2 and |·|_{1,2}; so we define, for x : (0, 1) → R, measurable,

    |x|_p := (∫_0^1 |x(r)|^p dr)^{1/p}   (∈ [0, ∞]),   p ∈ [1, ∞),
    |x|_∞ := ess sup_{r ∈ (0,1)} |x(r)|,

and define X_p := L^p((0, 1), dr), p ∈ [1, ∞], so X = X_2. If x, y ∈ H_0^1, set

    |x|_{1,2} := |x'|_2,   (x, y)_{1,2} := (x', y'),

where x' := (d/dr)x is the weak derivative of x. We shall use this notation from now
on and we also write x'' := (d²/dr²)x = Δx.


Let H^{-1} with norm |·|_{-1,2} be the dual space of H_0^1. We always use the
continuous and dense embeddings

(2.1)   H_0^1 ⊂ X ≡ X' ⊂ H^{-1},

so _{H_0^1}(x, y)_{H^{-1}} = (x, y) if x ∈ H_0^1, y ∈ X. The terms "Borel-measurable" or
"measure on X, H_0^1, H^{-1}, resp." will below always refer to their respective Borel
σ-algebras, if it is clear on which space we work. We note that since H_0^1 ⊂
X ⊂ H^{-1} continuously, by Kuratowski's theorem, H_0^1 ∈ ℬ(X), X ∈ ℬ(H^{-1})
and ℬ(X) ∩ H_0^1 = ℬ(H_0^1), ℬ(H^{-1}) ∩ X = ℬ(X). Furthermore, the Borel
σ-algebras on X and H_0^1 corresponding to the respective weak topologies
coincide with ℬ(X), ℬ(H_0^1), respectively.

For a function V : X → (0, ∞] having weakly compact level sets {V ≤ c},
c ∈ R_+, we define

(2.2)   WC_V := { f : {V < ∞} → R | f is continuous on each {V ≤ R}, R ∈ R,
                  in the weak topology inherited from X,
                  and lim_{R→∞} sup_{{V ≥ R}} |f|/V = 0 },

equipped with the norm ‖f‖_V := sup_{{V<∞}} V^{-1}|f|. Obviously, WC_V is a Banach
space with this norm. We are going to consider various choices of V, distinguished
by respective subindices, namely, we define, for κ ∈ (0, ∞),


(2.3)   V_κ(x) := e^{κ|x|_2^2},   x ∈ X,
        Θ_κ(x) := V_κ(x)(1 + |x'|_2^2),   x ∈ H_0^1;

and for p > 2,

(2.4)   V_{p,κ}(x) := e^{κ|x|_2^2}(1 + |x|_p^p),   x ∈ X,
        Θ_{p,κ}(x) := V_{p,κ}(x)(1 + |x'|_2^2) + V_κ(x) |(|x|^{p/2})'|_2^2,   x ∈ H_0^1.


Clearly, {V_{p,κ} < ∞} = X_p and {Θ_{p,κ} < ∞} = H_0^1. Each Θ_{p,κ} is extended to a
function on X by defining it to be equal to +∞ on X \ H_0^1. Abusing notation, for
p = 2, we also set V_{2,κ} := V_κ and Θ_{2,κ} := Θ_κ. For abbreviation, for κ ∈ (0, ∞),
p ∈ [2, ∞), we set

(2.5)   WC_{p,κ} := WC_{V_{p,κ}},   W_1C_{p,κ} := WC_{Θ_{p,κ}},

and we also abbreviate the norms correspondingly,


(2.6) Elp t= vous We ie =U lo and lupe = ll lo, 


All these norms are, of course, well defined for any function on X with values
in [−∞, ∞], and therefore we shall apply them below not just for functions in
WC_{p,κ} or W_1C_{p,κ}. For p' ≥ p and κ' ≥ κ, by restriction, WC_{p,κ} is continuously
and densely embedded into WC_{p',κ'} and into W_1C_{p,κ} (see Corollary 5.6 below), as
is the latter into W_1C_{p',κ'}. V_{p,κ} will serve as convenient Lyapunov functions
for L. Furthermore, Θ_{p,κ} bounds (λ − L)V_{p,κ} from below for large enough λ, thus,
Θ_{p,κ} measures the coercivity of L (cf. Lemma 4.6 below). Note that the level sets
of Θ_{p,κ} are even strongly compact in X.
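
The interplay between an exponential Lyapunov function and the function measuring coercivity can be seen in a one-dimensional caricature. The Python sketch below is an illustration under stated assumptions, not the paper's Lemma 4.6: it takes the hypothetical operator Lu = ½ a u'' − h x u' on R (one coordinate of the Laplacian drift, with h = π² the first Dirichlet eigenvalue and no nonlinearity F), the weight V(x) = e^{κx²} with a hypothetical κ < h/a, and verifies that (λ − L)V = (1 + 2κ(h − aκ)x²)V for λ = aκ + 1, so that (λ − L)V dominates V and grows like x²V. The closed form is cross-checked against central finite differences.

```python
import math

# Hypothetical parameters of the one-dimensional toy operator; need kappa < h / a.
a, h, kappa = 1.0, math.pi ** 2, 0.5

def V(x):
    """Exponential weight V(x) = exp(kappa * x^2)."""
    return math.exp(kappa * x * x)

def LV_fd(x, d=1e-4):
    """L V(x) = 0.5 * a * V''(x) - h * x * V'(x) by central finite differences."""
    d1 = (V(x + d) - V(x - d)) / (2 * d)
    d2 = (V(x + d) - 2 * V(x) + V(x - d)) / (d * d)
    return 0.5 * a * d2 - h * x * d1

def LV_exact(x):
    """Closed form: L V = (a*kappa + 2*kappa*(a*kappa - h) * x^2) * V."""
    return (a * kappa + 2 * kappa * (a * kappa - h) * x * x) * V(x)

def theta(x):
    """Toy analogue of the coercivity weight: (1 + 2*kappa*(h - a*kappa) * x^2) * V."""
    return (1.0 + 2 * kappa * (h - a * kappa) * x * x) * V(x)

lam = a * kappa + 1.0  # with this choice, (lam - L) V equals theta exactly
```

The restriction κ < h/a in this toy case plays the same role as an upper bound on κ in terms of the ratio between the drift and the noise strength.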

We recall that, for P_N as in the Introduction, there exists α_p ∈ [1, ∞) such that

(2.7)   |P_N x|_p ≤ α_p |x|_p   for all x ∈ X_p, N ∈ N

(cf. [40], Section 2c16), of course, with α_2 = 1. In particular,

(2.8)   V_{p,κ} ∘ P_N ≤ α_p^p V_{p,κ}.


For a function V : X → (1, ∞], we also define spaces Lip_{l,p,κ}, p ≥ 2, κ > 0,
consisting of functions on X which are locally Lipschitz continuous in the norm
|(−Δ)^{−l/2} ·|_2, l ∈ Z_+. The respective seminorms are defined as follows:

(2.9)   (f)_{l,p,κ} := sup_{y_1, y_2 ∈ X_p} (V_{p,κ}(y_1) ∨ V_{p,κ}(y_2))^{-1} |f(y_1) − f(y_2)| / |(−Δ)^{−l/2}(y_1 − y_2)|_2   (∈ [0, ∞]).

For l ∈ Z_+, we define

(2.10)   Lip_{l,p,κ} := {f : X_p → R | ‖f‖_{Lip_{l,p,κ}} < ∞},

where ‖f‖_{Lip_{l,p,κ}} := ‖f‖_{p,κ} + (f)_{l,p,κ}. When X is of finite dimension, (f)_{l,p,κ} is
a weighted norm of the generalized gradient of f (cf. Lemma 3.6 below). Also,
(Lip_{l,p,κ}, ‖·‖_{Lip_{l,p,κ}}) is a Banach space (cf. Lemma 5.7 below) and Lip_{l,p,κ} ⊂
Lip_{l',p',κ'} for l' ≤ l, p' ≥ p and κ' ≥ κ. In this paper we shall mostly deal with the
case l ∈ {0, 1}.

Obviously, each f ∈ Lip_{l,p,κ} is uniformly |(−Δ)^{−l/2} ·|_2-Lipschitz continuous
on every |·|_p-bounded set. In particular, any f ∈ Lip_{1,p,κ} is sequentially weakly
continuous on X_p, consequently weakly continuous on bounded subsets of X_p.
Hence, for all p' ∈ [p, ∞), κ' ∈ [κ, ∞),

(2.11)   ℬ_b(X_p) ∩ Lip_{1,p,κ} ⊂ WC_{p',κ'}

and obviously, by restriction,

(2.12)   ℬ_b(X_p) ∩ Lip_{0,p,κ} ⊂ W_1C_{p',κ'}.

Further properties of these function spaces will be studied in Section 5 below.

Besides the space D := FC_b^2 defined in the Introduction, other test function
spaces D_{p,κ} on X will turn out to be convenient. They are for p ∈ [2, ∞), κ ∈
(0, ∞) defined as follows:

(2.13)   D_{p,κ} := {u = g ∘ P_N | N ∈ N, g ∈ C²(R^N),
                    ‖u‖_{p,κ} + ‖|Du|_2‖_{p,κ} + ‖Tr(A D²u)‖_{p,κ} < ∞}.

Again we set D_κ := D_{2,κ}. Obviously, D_{p,κ} ⊂ WC_{p,κ} and D_{p,κ} ⊂ D_{p',κ'} if
p' ∈ [p, ∞) and κ' ∈ [κ, ∞). We extend the definition (1.2) of the Kolmogorov
operator L for all u ∈ FC² := {u = g ∘ P_N | N ∈ N, g ∈ C²(R^N)}. So, L can be
considered with domain D_{p,κ}.

Now let us collect our precise hypotheses on the terms in SPDE (1.1),
respectively the Kolmogorov operator (1.2). First, we recall that in the entire paper
Δx = x'' is the Dirichlet Laplacian on (0, 1) and (w_t)_{t≥0} is a cylindrical Brownian
motion on X. Consider the following condition on the map A : X → X:

(A) A is a nonnegative symmetric linear operator from X to X of trace class such
that A_N := P_N A P_N is an invertible operator represented by a diagonal matrix
on E_N for all N ∈ N.

Here E_N, P_N are as defined in the Introduction. Furthermore, we set

(2.14)   a_0 := sup_{x ∈ H_0^1 \ {0}} (x, Ax) / |x'|_2^2 = |A|_{H_0^1 → H^{-1}},

where |·|_{H_0^1 → H^{-1}} denotes the usual operator norm on bounded linear operators
from H_0^1 into its dual H^{-1}.
Consider the following conditions on the map F : H_0^1 → X:


(F1)
(2.15)   F(x)(r) = (d/dr)(Ψ ∘ x)(r) + Φ(r, x(r)),   x ∈ H_0^1(0, 1), r ∈ (0, 1),

where Ψ : R → R, Φ : (0, 1) × R → R satisfy the following conditions:
(Ψ) Ψ ∈ C^{1,1}(R) (i.e., Ψ is differentiable with locally Lipschitz derivative)
and there exist C ∈ [0, ∞) and a bounded, Borel-measurable function
ω : [0, ∞) → [0, ∞) vanishing at infinity such that

    |Ψ_xx|(x) ≤ C + √|x| ω(|x|)   for dx-a.e. x ∈ R.

(Φ1) Φ is Borel-measurable in the first and continuous in the second
variable and there exist g ∈ L^{q_1}(0, 1) with q_1 ∈ [2, ∞] and q_2 ∈ [1, ∞)
such that

    |Φ(r, x)| ≤ g(r)(1 + |x|^{q_2})   for all r ∈ (0, 1), x ∈ R.

(Φ2) There exist h_0, h_1 ∈ L^1(0, 1), |h_1|_1 < 2, such that for a.e. r ∈ (0, 1)

    Φ(r, x) sign x ≤ h_0(r) + h_1(r)|x|   for all x ∈ R.


($3) There exist po € (0, 1], 20 € L (0, 1,21€ LU (0, 1) for some pı € 
[2, oo], and a function w:[0, oo) — [0, oo) as in (V) such that with 
o : (0, 1) x R—R,o(r,x):— AU for a.e. r € (0, 1) 
P(r, y) — È (r, x) x [e0) + e16)lo (r, x)? Pro(o Gr, x))]Q — x) 


for allx, ye R,O< y-— x < po. 







Furthermore, we say that condition (F1+) holds if, in addition to (F1), we have


(64) d is twice continuously differentiable and there exist g5, 23 € Lo (0, 1), 
£4.85 € LLO, 1), and w:[0, oo) — [0, o0) as in (V) such that, for their 
partial derivatives D,,, Pyr, Px, P,, and with o as in (®3), 





|x|? 
Pry) + «x go o oo 
IM ibi g2 + 93/0 (a) 
and 
|o, o AP 3/2 
q————904d-gsoO ' oo). 


REMARK 2.1. (i) Integrating the inequality in (Ψ) twice, one immediately
sees that (Ψ) implies that there exist a bounded Borel-measurable function
ω̃ : R_+ → R_+, ω̃(r) → 0 as r → ∞, and C ∈ (0, ∞) such that

    |Ψ'|(x) ≤ C + |x|^{3/2} ω̃(|x|),   |Ψ|(x) ≤ C + |x|^{5/2} ω̃(|x|)   for all x ∈ R.

(ii) We emphasize that conditions (Φ2), (Φ3) are one-sided estimates, so that
(Φ1)–(Φ3) are satisfied if Φ(r, x) = P(x), r ∈ (0, 1), x ∈ R, where P is a
polynomial of odd degree with strictly negative leading coefficient.

(iii) Under the assumptions in (F1), SPDE (1.1) will not have a strong solution
in general for all x ∈ X.

(iv) If (Φ1) holds, (Φ2) only needs to be checked for x ∈ R such that |x| ≥ R
for some R ∈ (0, ∞). And replacing ω [in (Ψ) and (Φ3)] by ω̃(r) := sup_{s ≥ r} ω(s),
we may assume that ω is decreasing.

(v) (Φ4) implies that there exists a bounded measurable function ω̃ : R_+ → R_+,
ω̃(r) → 0 as r → ∞, such that


Ib. <C +0?) and |e| «gi +C -- o?" (o). 


In particular, (Φ4) implies (Φ3) with p_1 = 2, g_0(r) = g_1(r) = const. Indeed, we
have, for x ∈ R, r ∈ (0, 1),


r X \x r x 
(r, x) = ©, (0, 0) +f o. (s Zs) = ds + | Pr (s Zs) ds. 
0 F r 0 r 


As shown in the previous item, we may assume ω decreasing. Then it follows
from (Φ4) and Hölder's inequality that


3/2 
lo.10 x) «c 4 A JE ss +(= ) E Go ( P us)e Mas + | gads 


E)" [aoira 




=“ m |8212 «(8D "wi f (h)a) 


+lesh (77 BD f. soo| vs 5) as. 


Now observe that 


} 1/2 ] 
æla) i= (2f P (V2ox)e dr) +f 95(s)w(V20./s) ds 


is a bounded measurable function and ω̃(r) → 0 as r → ∞. So the first assertion
follows for r ∈ (0, 1/2]. For the case r ∈ (1/2, 1), the assertion is proved by the change
of variables r' = 1 − r. The second assertion is proved similarly.


In the rest of this paper hypothesis (A) (though repeated in each statement to make
partial reading possible) will always be assumed. As already said in the
Introduction, all of our results are proved for general F : H_0^1 → X under condition
(F2) [resp. (F2+), or parts thereof], which is introduced in Section 4 and which
is weaker than (F1) [resp. (F1+)]. For the convenience of the reader, we now,
however, formulate our results for the concrete F given in (2.15), under condition
(F1) [(F1+) resp.]. For their proofs, we refer to the respective more general results,
stated and proved in one of the subsequent sections.


THEOREM 2.2 ("Pointwise solutions of the Kolmogorov equations").
Suppose (A) and (F1) hold. Let κ_0 := (2 − |h_1|_1)/(8a_0) [with a_0 as in (2.14) and h_1 as in (Φ2)],
0 < κ_1 ≤ κ* < κ_0, and let p ∈ [2, ∞) ∩ (q_2 − 3 + 2/q_1, ∞) [with q_1, q_2 as in (Φ1)].


Then there exists a semigroup (p_t)_{t>0} of probability kernels on X_p, independent
of κ*, having the following properties:

(i) ("Existence") Let u ∈ D_{κ_1}. Then t ↦ p_t(|Lu|)(x) is locally Lebesgue
integrable on [0, ∞) and

(2.16)   p_t u(x) − u(x) = ∫_0^t p_s(Lu)(x) ds   for all x ∈ X_p.

In particular, for all s ∈ [0, ∞),

    lim_{t→0} p_{s+t}u(x) = p_s u(x)   for all x ∈ X_p.

(ii) There exists λ_{κ*} ∈ (0, ∞) such that

(2.17)   ∫_0^∞ e^{−λ_{κ*} s} p_s(Θ_{p,κ*})(x) ds < ∞   for all x ∈ X_p.

(iii) ("Uniqueness") Let (q_t)_{t>0} be a semigroup of probability kernels on X_p
satisfying (i) with (p_t)_{t>0} replaced by (q_t)_{t>0} and D_{κ_1} by D. If, in addition,
(2.17) holds with (q_t)_{t>0} replacing (p_t)_{t>0} for some κ ∈ (0, κ_0) replacing κ*, then
p_t(x, dy) = q_t(x, dy) for all t > 0, x ∈ X_p.




(iv) ("Regularity") Let t ∈ (0, ∞). Then p_t(WC_{p,κ*}) ⊂ WC_{p,κ*}. Furthermore, let
f ∈ Lip_{0,2,κ_1} ∩ ℬ_b(X) ∩ WC_{p,κ*} (⊃ D). Then p_t f uniquely extends to a
continuous function on X, again denoted by p_t f, which is in Lip_{0,2,κ_1} ∩ ℬ_b(X). Let
q ∈ [2, ∞), κ ∈ [κ_1, κ*]. Then there exists λ_{q,κ} ∈ (0, ∞), independent of t and f,
such that

    ‖p_t f‖_{q,κ} ≤ e^{λ_{q,κ} t} ‖f‖_{q,κ}

and

    (p_t f)_{0,q,κ} ≤ e^{λ_{q,κ} t} (f)_{0,q,κ}.

If, moreover, (F1+) holds, then there exists λ'_{q,κ} ∈ (0, ∞), independent of t, such
that, for all f ∈ Lip_{1,2,κ_1} ∩ ℬ_b(X),

    (p_t f)_{1,q,κ} ≤ e^{λ'_{q,κ} t} (f)_{1,q,κ}.

PROOF. The assertions follow from Corollary 4.2, Remark 6.6,
Propositions 6.7, 6.9 and 6.11(i). □


THEOREM 2.3 ["Martingale and weak solutions to SPDE (1.1)"]. Assume that
(A) and (F1) hold, and let p, κ* be as in Theorem 2.2.

(i) There exists a conservative strong Markov process M := (Ω, F, (F_t)_{t≥0},
(x_t)_{t≥0}, (P_x)_{x∈X_p}) in X_p with continuous sample paths in the weak topology
whose transition semigroup is given by (p_t)_{t>0} from Theorem 2.2. In particular,
for λ_{κ*} as in Theorem 2.2(ii),

    E_x[ ∫_0^∞ e^{−λ_{κ*} s} Θ_{p,κ*}(x_s) ds ] < ∞   for all x ∈ X_p.

(ii) ("Existence") Let κ_1 ∈ (0, κ_0 − κ*). Then M satisfies the martingale
problem for (L, D_{κ_1}), that is, for all u ∈ D_{κ_1} and all x ∈ X_p, the function t ↦ |Lu(x_t)|
is locally Lebesgue integrable on [0, ∞) P_x-a.s. and under P_x,

    u(x_t) − u(x) − ∫_0^t Lu(x_s) ds,   t ≥ 0,

is an (F_t)_{t≥0}-martingale starting at 0 (cf. [54]).

(iii) ("Uniqueness") M is unique among all conservative (not necessarily
strong) Markov processes M' := (Ω', F', (F'_t)_{t≥0}, (x'_t)_{t≥0}, (P'_x)_{x∈X_p}) with weakly
continuous sample paths in X_p satisfying the martingale problem for (L, D)
[as specified in (ii) with D replacing D_{κ_1}] and having the additional property
that, for some κ ∈ (0, κ_0), there exists λ_κ ∈ (0, ∞) such that

(2.18)   E'_x[ ∫_0^∞ e^{−λ_κ s} Θ_{p,κ}(x'_s) ds ] < ∞   for all x ∈ X_p.

(iv) If p > 2q_2 − 6 + 4/q_1, then M weakly solves SPDE (1.1).


PROOF. Corollary 4.2, Remark 6.6, Theorem 7.1 and Remark 7.2 below. □


THEOREM 2.4 ("Invariant measure"). Assume that (A) and (F1) hold. Let
p, κ* be as in Theorem 2.2.

(i) There exists a probability measure μ on H_0^1 which is "L-infinitesimally
invariant," that is, Lu ∈ L^1(H_0^1, μ) and

(2.19)   ∫ Lu dμ = 0   for all u ∈ D

(L*μ = 0 for short). Furthermore,

(2.20)   ∫ Θ_{p,κ*} dμ < ∞.

(ii) μ, extended by zero to all of X_p, is (p_t)_{t>0}-invariant, that is, for all
f : X → R, bounded, measurable, and all t > 0,

    ∫ p_t f dμ = ∫ f dμ

[with (p_t)_{t>0} from Theorem 2.2]. In particular, μ is a stationary measure for the
Markov process M from Theorem 2.3.

PROOF. See the Appendix. □


3. Finite-dimensional approximation: uniform estimates. In this section
we study finite-dimensional approximation of (1.2)–(1.3). The results will be used
in an essential way below.

The main result of this section is Proposition 3.4, giving estimates on the
resolvent, including its gradients, associated with the approximation L_N of our
operator L on E_N [cf. (3.3) below]; these estimates are uniform with respect to N.
As a preparation, we need several results of which the second (i.e., an appropriate
version of a weak maximum principle) is completely standard. Nevertheless, we
include the proof for the convenience of the reader.

Below, the background space is the Euclidean space R^N, N ∈ N, with the
Euclidean inner product denoted by (·, ·), dx denotes the Lebesgue measure on
R^N and L^p(R^N), W^{r,p}_loc(R^N), r ∈ N ∪ {0}, p ∈ [1, ∞], the corresponding L^p and
local Sobolev spaces, respectively.


PROPOSITION 3.1. Let A : R^N → R^N be a symmetric strictly positive
definite linear operator (matrix), F : R^N → R^N be a bounded measurable vector
field, λ_* := sup_{x ∈ R^N} ¼ |A^{−1/2}F(x)|², ρ ∈ L^1(R^N) be strictly positive and locally
Lipschitz and W ∈ L^∞_loc(R^N), W ≥ 0. Let

    Lu := ρ^{-1} div(ρA Du) + (F, Du) − Wu = Tr(A D²u) + (ρ^{-1}A Dρ + F, Du) − Wu,
    u ∈ W^{2,1}_loc(R^N).

Then there exists a unique sub-Markovian pseudo-resolvent (R_λ)_{λ>0} on L^∞(R^N),
that is, a family of operators satisfying the first resolvent equation, which is
Markovian if W = 0, such that:

(a) Range(R_λ) ⊂ Dom := {u ∈ ∩_{p<∞} W^{2,p}_loc(R^N) | u, Lu ∈ L^∞(R^N)} and
(λ − L)R_λ = id for all λ > 0.

(b) For all λ > λ_* and f ∈ L^∞(R^N), one has |DR_λ f| ∈ L²(R^N, ρ dx).
(c) For all f ∈ L^∞(R^N), one has lim_{λ→∞} λR_λ f = f in L²(R^N, ρ dx).

Hence, in particular, R_λ f for f ∈ L^∞(R^N) has a continuous dx-version, as have
its first weak derivatives, and for the continuous versions of R_λ f, λ > 0, the
resolvent equation holds pointwise on all of R^N. If both f and F above are in addition
locally Lipschitz, then R_λ f ∈ ∩_{p<∞} W^{3,p}_loc(R^N) for every λ > 0, hence, its
continuous dx-version is in C²(R^N).


PROOF. Consider the following bilinear form (ℰ, D(ℰ)) in L²(R^N, ρ dx):

    ℰ(u, v) := ∫_{R^N} [(Du, A Dv) − (F, Du)v + Wuv] ρ dx,

    D(ℰ) := {u ∈ W^{1,2}_loc(R^N) | ∫_{R^N} [u² + |Du|² + Wu²] ρ dx < ∞}.

Since, for all u, v ∈ D(ℰ),

(3.1)   |u(F, Dv)| ≤ (Dv, A Dv) + λ_*|u|²,

it follows that ℰ ≥ −λ_*. Then it is easy to show that (ℰ + λ_*(·, ·), D(ℰ)) is
a Dirichlet form (cf. [43], Section I.4, i.e., a closed sectorial Markovian form)
on L²(R^N, ρ dx). Hence, there exists an associated sub-Markovian strongly
continuous resolvent (R_λ)_{λ>λ_*} and semigroup (P_t)_{t>0} on L²(R^N, ρ dx) (cf. ibid.).
Note that 1 ∈ D(ℰ) and ℰ(1, v) = 0 for all v ∈ D(ℰ), provided W = 0, so
(R_λ)_{λ>λ_*} and (P_t)_{t>0} are even Markovian in this case. In particular, assertion (b)
holds. Note that, for a bounded f ∈ L²(R^N, ρ dx), we can define

    R_λ f := ∫_0^∞ e^{−λt} P_t f dt

even for all λ > 0 instead of λ > λ_*. Here, the L²(R^N, ρ dx)-valued integral is
taken in the sense of Bochner. Then λR_λ f → f as λ → ∞ in L²(R^N, ρ dx) and
(R_λ)_{λ>0} is a sub-Markovian pseudo-resolvent on L^∞(R^N). In particular, the first
resolvent equation and assertion (c) hold.

To show (a), we first note that, for λ > λ_* and f ∈ L^∞(R^N), the bounded
function u := R_λ f is a weak solution to the equation

    λu − Lu = λu − ρ^{-1} div(ρA Du) − (F, Du) + Wu = f   in R^N.

Hence, it follows from [28], Theorem 8.8, that u ∈ W^{2,2}_loc(R^N). Then [28],
Lemma 9.16, yields that u ∈ Dom. Thus, Range(R_λ) ⊂ Dom, provided λ > λ_*.
Now let λ ∈ (0, λ_*]. Then for all λ' > λ_*, R_λ f = R_{λ'} f + (λ' − λ)R_{λ'}R_λ f. Hence,
R_λ f ∈ Dom and (λ' − L)R_λ f = f + (λ' − λ)R_λ f. So, (λ − L)R_λ f = f. The
last part follows by Sobolev embedding. □


LEMMA 3.2. Let A : R^N → R^N be a symmetric strictly positive definite linear
operator (matrix), F : R^N → R^N be a bounded measurable vector field, λ_* :=
sup_{x ∈ R^N} ¼ |A^{−1/2}F(x)|², ρ > 0 be locally Lipschitz and W ∈ L^∞_loc(R^N), W ≥ 0.
For λ > λ_*, let u ∈ W^{1,2}_loc(R^N) ∩ L²(R^N, ρ dx) be a weak super-solution to the
equation

    λu − ρ^{-1} div(ρA Du) − (F, Du) + Wu = 0   on R^N

[i.e., a weak solution to the inequality λu − ρ^{-1} div(ρA Du) − (F, Du) + Wu ≥ 0].
Then u ≥ 0.


PROOF. ForÓ c C. (R^), choose up as a test function. Then, using the 
fact that u* A u~ = 0, we obtain that, for all £ > 0, 


0z-— [Io + W)(u- 0» + (D(u^ 6^, ADu )— u 0*(F, Du )|o dx 
Se J [A + W)(u-0Y? + (Du70), AD(u70)) — u70 (F, D(u-0))] pdx 


ds | (u-Y*[(D0, AD@) — €(F, D0)]o dx 


1 
~p\2 -2 
<~ f A= AHE) 0) pdx+ | (wu ) (1+2) (D0, AD®)padx, 


where we used the fact that W > 0, € > —A, and we applied (3.1) with £0, +8 
replacing u, v, respectively. Hence, for all e > Q, 


(4 — (1-- 8)A,) [anor dx < (1 4 =) J «os. AD0)p dx. 


Now we choose & > 0 such that à > (1 + ©)A, and let 0 71 and DO — 0 such 
that (D8, AD0) < Ca. Then the dominated convergence theorem yields u~ = 0. 
[] 


COROLLARY 3.3. Let A : R^N → R^N be a symmetric strictly positive definite
linear operator (matrix), F : R^N → R^N be a bounded measurable vector field,
λ_* := sup_{x ∈ R^N} ¼ |A^{−1/2}F(x)|², ρ > 0 be locally Lipschitz and W ∈ L^∞_loc(R^N).
Let V ∈ C²(R^N), V ≥ 1 be such that, for some λ_V ∈ R,

(3.2)   λ_V V − ρ^{-1} div(ρA DV) − (F, DV) + WV ≥ 0.


Let f ∈ L²(R^N, ρ dx), V^{-1}f ∈ L^∞(R^N), λ > λ_* + λ_V and u ∈ W^{1,2}_loc(R^N) ∩
L²(R^N, ρ dx) be a weak sub-solution to the equation

    λu − ρ^{-1} div(ρA Du) − (F, Du) + Wu = f   on R^N

[i.e., a weak solution to the inequality λu − ρ^{-1} div(ρA Du) − (F, Du) + Wu ≤ f].
Then


1 
vo! < 
|l Ulloo < EE 





IV fiio: 


PROOF. Let W̃ := V^{-1}[λ_V V − ρ^{-1} div(ρA DV) − (F, DV) + WV] and v :=
V^{-1}u. It is easy to see that v ∈ W^{1,2}_loc(R^N) ∩ L²(R^N, V²ρ dx) and it is a weak
sub-solution to the equation

    (λ − λ_V)v − V^{-2}ρ^{-1} div(AV²ρ Dv) − (F, Dv) + W̃v = V^{-1}f   on R^N.

Note that V^{-1}f ∈ L²(R^N, V²ρ dx). Since W̃ ≥ 0, the result now follows from
Lemma 3.2 and the fact that the resolvent associated on L²(R^N, V²ρ dx) with the
bilinear form

    ℰ̃(g, h) := ∫_{R^N} [(Dg, A Dh) − (F, Dg)h + W̃gh] V²ρ dx,

    D(ℰ̃) := {g ∈ W^{1,2}_loc(R^N) | ∫_{R^N} [g² + |Dg|² + W̃g²] V²ρ dx < ∞},

is sub-Markovian. □


PROPOSITION 3.4. Let A, H : R^N → R^N be symmetric strictly positive
definite linear operators (matrices) such that AH = HA. Let F : R^N → R^N be a
bounded locally Lipschitz vector field. Let

(3.3)   Lu(x) := Tr(A D²u)(x) + (−Hx + F(x), Du(x)),
        u ∈ W^{2,1}_loc(R^N), x ∈ R^N.

Let Γ : R^N → R^N be a symmetric nondegenerate linear operator (matrix) such
that ΓH = HΓ. Assume the following:

(i) there exist V_0 ∈ C²(R^N), V_0 ≥ 1 and λ_{V_0} ∈ R such that

(3.4)   (λ_{V_0} − L)V_0 ≥ 0;

(ii) there exist V_1 ∈ C²(R^N), V_1 ≥ 1 and λ_{V_1} ∈ R such that

(3.5)   (λ_{V_1} − L − W)V_1 ≥ 0
        with W(x) := sup_{|y|=1} [(DF(x)Γy, Γ^{-1}y) − |H^{1/2}y|²],   x ∈ R^N.

Then:

(i) there exists a unique Markovian pseudo-resolvent (R_λ)_{λ>0} on L^∞(R^N)
such that

    Range(R_λ) ⊂ {u ∈ ∩_{p<∞} W^{2,p}_loc(R^N) | u, Lu ∈ L^∞(R^N)},

(λ − L)R_λ = id for all λ > 0, and λR_λ f → f as λ → ∞ pointwise on R^N for
bounded locally Lipschitz f;
(ii) for a bounded locally Lipschitz f, we have

(3.6)   ‖V_0^{-1} R_λ f‖_∞ ≤ (1/(λ − λ_{V_0})) ‖V_0^{-1} f‖_∞

for all λ > λ_{V_0} and

(3.7)   sup_x V_1^{-1}|ΓDR_λ f|(x) ≤ (1/(λ − λ_{V_1})) ess sup_x V_1^{-1}|ΓDf|(x)

for all λ > λ_{V_1}, provided V_1^{-1}|ΓDf| ∈ L^∞(R^N) and |Df| ∈ L²(R^N, ρ dx). Here
DR_λ f and R_λ f denote the (unique) continuous dx-versions of DR_λ f, R_λ f,
respectively, which exist by assertion (i), and ρ(x) := exp(−½(x, A^{-1}Hx)), x ∈ R^N.
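
The mechanism behind the weighted resolvent bound (3.6) can be checked by hand in a finite-state analogue. The Python sketch below is such an analogue under explicitly hypothetical choices and is not the proof of the paper: L is replaced by the generator Q of a two-state Markov chain, V_0 by a weight vector V ≥ 1, and λ_V is taken as the smallest constant with QV ≤ λ_V V, which is the discrete counterpart of (3.4). Since (λ − Q)^{-1} is a positive operator, one obtains ‖V^{-1}(λ − Q)^{-1}f‖_∞ ≤ (λ − λ_V)^{-1}‖V^{-1}f‖_∞ for λ > λ_V.

```python
# Hypothetical two-state generator and weight (illustrative values only).
a, b, v = 2.0, 1.0, 3.0
Q = [[-a, a], [b, -b]]
V = [1.0, v]

# Smallest lam_V with (Q V)_i <= lam_V * V_i for both states.
QV = [Q[i][0] * V[0] + Q[i][1] * V[1] for i in range(2)]
lam_V = max(QV[i] / V[i] for i in range(2))

def resolvent_apply(lam, f):
    """Solve (lam - Q) u = f for the 2x2 generator Q by Cramer's rule."""
    m00, m01 = lam - Q[0][0], -Q[0][1]
    m10, m11 = -Q[1][0], lam - Q[1][1]
    det = m00 * m11 - m01 * m10
    return [(m11 * f[0] - m01 * f[1]) / det,
            (m00 * f[1] - m10 * f[0]) / det]

def weighted_sup(g):
    """Weighted sup norm: max_i |g_i| / V_i."""
    return max(abs(g[i]) / V[i] for i in range(2))
```

A typical use is to compare `weighted_sup(resolvent_apply(lam, f))` with `weighted_sup(f) / (lam - lam_V)` for several right-hand sides f and several lam > lam_V.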


To prove Proposition 3.4, we need another lemma. 


LEMMA 3.5. Let A, H : R^N → R^N be symmetric strictly positive definite
linear operators (matrices) such that AH = HA. Let F : R^N → R^N be a bounded
locally Lipschitz vector field. Let L be defined as in (3.3).

Let Γ : R^N → R^N be a symmetric nondegenerate linear operator (matrix) such
that ΓH = HΓ.

Let λ ∈ R, f be locally Lipschitz and u ∈ W^{1,2}_loc(R^N) be a weak solution to the
equation (λ − L)u = f on R^N.

Then u ∈ ∩_{p<∞} W^{3,p}_loc(R^N) and v := |ΓDu| is a weak sub-solution to the
equation

    (λ − L − W)v = |ΓDf|
    with W(x) := sup_{|y|=1} [(DF(x)Γy, Γ^{-1}y) − |H^{1/2}y|²],   x ∈ R^N.

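PROOF. Throughout the proof let $(f,g)$ stand for $\int_{\mathbb{R}^N} f(x)g(x)\,dx$ or $\int_{\mathbb{R}^N} (f(x),g(x))\,dx$ whenever $fg \in L^1(\mathbb{R}^N,dx)$ or $(f,g) \in L^1(\mathbb{R}^N,dx)$, for $f,g : \mathbb{R}^N \to \mathbb{R}$ or $f,g : \mathbb{R}^N \to \mathbb{R}^N$ measurable. Let $\eta_m$, $m = 1,\dots,N$, be the (common) orthonormal eigenbasis for $H$ and $\Gamma$, $\Gamma\eta_m = \gamma_m\eta_m$, $m = 1,\dots,N$.

By [28], Theorem 8.8 and Lemma 9.16, $u \in \bigcap_{p<\infty} W^{2,p}_{\mathrm{loc}}(\mathbb{R}^N)$. For $m = 1,\dots,N$, let $u_m := \partial_{\eta_m}u$. Then, for a bounded $\phi \in W^{1,2}_c(\mathbb{R}^N)$, integration by parts yields

$(Du_m, AD\phi) = -\big(\lambda u_m - \partial_{\eta_m}f - (-H\cdot + F, Du_m) - (-H\eta_m + \partial_{\eta_m}F, Du),\ \phi\big).$

Set $[Du] := |\Gamma Du|$ and, for $\varepsilon > 0$, $[Du]_\varepsilon := \sqrt{|\Gamma Du|^2 + \varepsilon}$. For $\phi \in C_c^\infty(\mathbb{R}^N)$ and $m = 1,\dots,N$, choose $\phi_m := \gamma_m^2\,\dfrac{u_m\phi}{[Du]_\varepsilon}$. Then $\phi_m$ is bounded and

$D\phi_m = \gamma_m^2\left(\dfrac{u_m\,D\phi}{[Du]_\varepsilon} + \dfrac{\phi\,Du_m}{[Du]_\varepsilon} - \dfrac{u_m\phi\,D^2u\,\Gamma^2Du}{[Du]_\varepsilon^3}\right)$

with $|D\phi_m| \in \bigcap_{p<\infty} L^p(\mathbb{R}^N, dx)$.

Hence, a.e. on $\mathbb{R}^N$,

$\sum_m (Du_m, AD\phi_m) = (D[Du]_\varepsilon, AD\phi) + \dfrac{\phi}{[Du]_\varepsilon}\left[\mathrm{Tr}(\Gamma D^2u\,A\,D^2u\,\Gamma) - \left(\dfrac{\Gamma Du}{[Du]_\varepsilon},\ \Gamma D^2u\,A\,D^2u\,\Gamma\,\dfrac{\Gamma Du}{[Du]_\varepsilon}\right)\right].$

Since $D([Du]_\varepsilon) = \dfrac{D^2u\,\Gamma^2Du}{[Du]_\varepsilon}$, it follows that $v_\varepsilon := [Du]_\varepsilon$ is a weak solution of the equation $(\lambda - L - W_\varepsilon)v_\varepsilon = G_\varepsilon$, where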





(-1H!?T Du + (T Du, '(DF) Du)) 





W, = 
Hd x IP 
and 


E I Du 
Miei T (i, Pf) 

















Du Du 
M Teir D?u A D?ur} — ( T D?uAD?ur )}. 
il | Dal [Du], 


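We have $W_\varepsilon \le W$ a.e., so $v_\varepsilon$ is a weak sub-solution to the equation $(\lambda - L - W)v = G_\varepsilon$. Passing to the limit as $\varepsilon \to 0$, we see that $v_\varepsilon = [Du]_\varepsilon$ converges to $v = [Du]$ in $W^{1,2}_{\mathrm{loc}}(\mathbb{R}^N)$ and, thus, the assertion follows. □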


PROOF OF PROPOSITION 3.4. Note that, provided $AH = HA$, we have

$Lu = \rho^{-1}\operatorname{div}(\rho ADu) + (F, Du),$

where $\rho(x) = \exp(-\tfrac12(x, A^{-1}Hx))$. Hence, Proposition 3.1(a) implies assertion (i) except for the fact that $\lambda R_\lambda f \to f$ pointwise as $\lambda \to \infty$, which we shall prove at the end.
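
As a quick check of the divergence form used at the start of this proof, the following sketch expands $\rho^{-1}\operatorname{div}(\rho ADu)$. It assumes that $L$ has leading term $\mathrm{Tr}(AD^2u)$ as in (3.3), and it uses only that $A$ and $H$ are symmetric and commute. It is meant as a verification aid, not as part of the original argument.

```latex
% Assumption: \rho(x) = \exp(-\tfrac12 (x, A^{-1}Hx)) with A, H symmetric and AH = HA,
% so that A^{-1}H is symmetric as well.
\begin{align*}
D\rho(x) &= -A^{-1}Hx\,\rho(x), \\
\rho^{-1}\operatorname{div}(\rho A Du)
  &= \operatorname{Tr}(AD^2u) + \rho^{-1}(D\rho, A Du) \\
  &= \operatorname{Tr}(AD^2u) + (-A^{-1}Hx, A Du) \\
  &= \operatorname{Tr}(AD^2u) + (-Hx, Du).
\end{align*}
% Adding (F, Du) to both sides recovers Lu as written in (3.3).
```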




Let $f : \mathbb{R}^N \to \mathbb{R}$ be bounded and locally Lipschitz and such that $V_1^{-1}|\Gamma Df| \in L^\infty(\mathbb{R}^N)$, and let $u := R_\lambda f$.


By assertion (i), u is a weak solution of the equation (A — L)u = f on RN 
and, by Lemma 3.5, v := |U Du| is a weak sub-solution to the equation (A — 
L — W)v = |-Df| on RY with W as in Lemma 3.5. Let first A > Ay + 
àv V Ay,. Note that u € L?(R”, pdx) and v € L7(R%, pdx) by Proposi- 
tion 3.1(b). Then (3.6)-(3.7) follow from assumptions (i) and (ii) and Corol- 
lary 3.3, since f, |T DF] e L^(RF, pdx). 

By density, for A > Ax + Ay, V Ay,, the operator R, can be continuously 
extended to the completion of the bounded locally Lipschitz functions on RV 
with respect to || y - leo, preserving the resolvent identity and estimate (3.6). 
Moreover, for a locally Lipschitz f such that Vl f. y In Df| € LO (RY) and 
|Df| e L^(R", p dx), estimate (3.7) holds. This is easy to see by replacing f by 
Cf V (—v)) ^ n and letting n — oo. Now, for A € (Ay, Ax + Ay, V Ay], one can 
define 


(3.8) $R_\lambda := \sum_{k=1}^{\infty} (\lambda_0 - \lambda)^{k-1} R_{\lambda_0}^{k},$


with some Ag > Àx + Ay, V Ay,. The series converges in operator norm due to (3.6) 
and (3.6) is preserved: 





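$\|V_0^{-1}R_\lambda f\|_\infty \le \sum_{k=1}^{\infty} (\lambda_0 - \lambda)^{k-1}\,\|V_0^{-1}R_{\lambda_0}^{k}f\|_\infty \le \sum_{k=1}^{\infty} \dfrac{(\lambda_0 - \lambda)^{k-1}}{(\lambda_0 - \lambda_{V_0})^{k}}\,\|V_0^{-1}f\|_\infty = \dfrac{1}{\lambda - \lambda_{V_0}}\,\|V_0^{-1}f\|_\infty.$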

On L** (R^), obviously R defined in (3.8) coincides with R, defined in Propo- 
sition 3.1 with W = 0. So, AR, remains Markovian for A € (Ày), Ax + Ay, V Ay, ]. 
By similar arguments, using the closability of I D, we prove that (3.7) is preserved 
for A € (Ay,, Ax + Àv V Av]. 
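
The last equality in the estimate following (3.8) is a geometric series. The sketch below spells it out under the assumption $\lambda_{V_0} < \lambda < \lambda_0$, so that the ratio $q$ (a symbol introduced here only for illustration) lies in $(0,1)$.

```latex
% Assumption: \lambda_{V_0} < \lambda < \lambda_0.
% Hypothetical shorthand: q := (\lambda_0 - \lambda)/(\lambda_0 - \lambda_{V_0}) \in (0,1).
\begin{align*}
\sum_{k=1}^{\infty} \frac{(\lambda_0-\lambda)^{k-1}}{(\lambda_0-\lambda_{V_0})^{k}}
  &= \frac{1}{\lambda_0-\lambda_{V_0}} \sum_{k=0}^{\infty} q^{k}
   = \frac{1}{\lambda_0-\lambda_{V_0}}\cdot\frac{1}{1-q} \\
  &= \frac{1}{(\lambda_0-\lambda_{V_0}) - (\lambda_0-\lambda)}
   = \frac{1}{\lambda-\lambda_{V_0}} .
\end{align*}
% Each summand bounds \|V_0^{-1} R_{\lambda_0}^k f\|_\infty by iterating (3.6) k times at \lambda_0.
```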

We are left to prove that $\lambda R_\lambda f \to f$ pointwise on $\mathbb{R}^N$ as $\lambda \to \infty$, for any bounded locally Lipschitz $f$. The proof is by contradiction. Let $x_0 \in \mathbb{R}^N$ be such that, for some subsequence $\lambda_n \to \infty$ and some $\varepsilon \in (0,1]$,

(3.9) $|\lambda_n R_{\lambda_n}f(x_0) - f(x_0)| > \varepsilon \quad \forall n \in \mathbb{N}.$

Selecting another subsequence if necessary, by Proposition 3.1(c), we may assume that the complement of the set

$M := \{x \in \mathbb{R}^N \mid \lim_{n\to\infty} \lambda_n R_{\lambda_n}f(x) = f(x)\}$

in $\mathbb{R}^N$ has Lebesgue measure zero, so $M$ is dense in $\mathbb{R}^N$. By (3.7), the sequence $(\lambda_n R_{\lambda_n}f)_{n\in\mathbb{N}}$ is equicontinuous and converges on the dense set $M$ to the continuous function $f$; hence, it must converge everywhere on $\mathbb{R}^N$ to $f$. This contradicts (3.9). □


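LEMMA 3.6. Let $V : \mathbb{R}^N \to [1,\infty)$ be convex (hence, continuous) and let $\Gamma : \mathbb{R}^N \to \mathbb{R}^N$ be a symmetric invertible linear operator (matrix) and $f : \mathbb{R}^N \to \mathbb{R}$ be locally Lipschitz. Then

(3.10) $\big\|V^{-1}|\Gamma Df|\big\|_\infty = \sup_{y_1,y_2\in\mathbb{R}^N} \dfrac{1}{V(y_1)\vee V(y_2)}\,\dfrac{|f(y_1)-f(y_2)|}{|\Gamma^{-1}(y_1-y_2)|}.$

PROOF. We may assume that $f \in C^1(\mathbb{R}^N)$. The general case follows by approximation. Let $x \in \mathbb{R}^N$. Then we have

$V^{-1}(x)\,|\Gamma Df(x)| \le \sup_{y_1,y_2\in\mathbb{R}^N} \dfrac{1}{V(y_1)\vee V(y_2)}\,\dfrac{|f(y_1)-f(y_2)|}{|\Gamma^{-1}(y_1-y_2)|}.$

On the other hand, for $y_1, y_2 \in \mathbb{R}^N$,

$\dfrac{1}{V(y_1)\vee V(y_2)}\,\dfrac{|f(y_1)-f(y_2)|}{|\Gamma^{-1}(y_1-y_2)|} = \dfrac{1}{V(y_1)\vee V(y_2)} \left|\int_0^1 \big(\Gamma Df(ty_1+(1-t)y_2),\ \Gamma^{-1}(y_1-y_2)\big)\,|\Gamma^{-1}(y_1-y_2)|^{-1}\,dt\right| \le \big\|V^{-1}|\Gamma Df|\big\|_\infty,$

where we used that $V(ty_1+(1-t)y_2) \le V(y_1)\vee V(y_2)$, since $V$ is convex. Hence, the assertion follows. □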
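
The first inequality in the proof of Lemma 3.6 (the pointwise bound of $V^{-1}(x)|\Gamma Df(x)|$ by the supremum) can be seen by a directional difference quotient. The sketch below is one way to do this, assuming $f \in C^1$; the vector $e$ and the points $y_1, y_2$ are illustrative choices, not notation from the source.

```latex
% Sketch, assuming f \in C^1 and \Gamma Df(x) \neq 0 (otherwise there is nothing to show).
% Hypothetical notation: e := \Gamma Df(x)/|\Gamma Df(x)|, \ y_1 := x + t\,\Gamma e, \ y_2 := x, \ t > 0.
\begin{align*}
|\Gamma^{-1}(y_1 - y_2)| &= t\,|e| = t, \\
f(y_1) - f(y_2) &= t\,(Df(x), \Gamma e) + o(t)
                 = t\,(\Gamma Df(x), e) + o(t)
                 = t\,|\Gamma Df(x)| + o(t), \\
\frac{1}{V(y_1)\vee V(y_2)}\,\frac{|f(y_1)-f(y_2)|}{|\Gamma^{-1}(y_1-y_2)|}
  &\longrightarrow V^{-1}(x)\,|\Gamma Df(x)| \qquad (t \downarrow 0),
\end{align*}
% using the symmetry of \Gamma and the continuity of V at x.
```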


REMARK 3.7. We note that if the right-hand side of (3.10) is finite, then f is 
Lipschitz on the level sets of V. 


4. Approximation and condition (F2). In this section we construct a sequence $F_N : E_N \to E_N$, $N \in \mathbb{N}$, of bounded locally Lipschitz continuous vector fields approximating the nonlinear drift $F$. The corresponding operators $L_N$, $N \in \mathbb{N}$, are of the form

(4.1) $L_Nu(x) := \tfrac12\mathrm{Tr}(A_ND^2u)(x) + (x'' + F_N(x), Du(x)), \quad u \in W^{2,1}_{\mathrm{loc}}(E_N),\ x \in E_N,\ N \in \mathbb{N},$

whose resolvents $(G^{(N)}_\lambda)_{\lambda>0}$, lifted to $X_p$, will be shown in Section 6 to converge weakly to the resolvent of $L$.

We introduce the following condition for a map $F : H_0^1 \to X$:




(F2) For every $k \in \mathbb{N}$, the map $F^{(k)} := (F, \eta_k) : H_0^1 \to \mathbb{R}$ is $|\cdot|_2$-continuous on $|\cdot|_{1,2}$-balls and there exists a sequence $F_N : E_N \to E_N$, $N \in \mathbb{N}$, of bounded locally Lipschitz continuous vector fields satisfying the following conditions:

(F2a) There exist «o € (0, à] and a set Qreg C [2, oo) such that 2 € Oreg 
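and for all $\kappa \in (0,\kappa_0)$, $q \in Q_{\mathrm{reg}}$, there exist $m_{q,\kappa} > 0$ and $\lambda_{q,\kappa} \in \mathbb{R}$ such that for all $N \in \mathbb{N}$,

(4.2) $L_NV_{q,\kappa} := L_N(V_{q,\kappa}|_{E_N}) \le \lambda_{q,\kappa}V_{q,\kappa} - m_{q,\kappa}\Theta_{q,\kappa}$ on $E_N$.

(F2b) For all $\varepsilon \in (0,1)$, there exists $C_\varepsilon \in (0,\infty)$ such that for all $N \in \mathbb{N}$ and $dx$-a.e. $x \in E_N$ (where $dx$ denotes Lebesgue measure on $E_N$)

$(DF_N(x)y, y) \le |y'|_2^2 + (\varepsilon|x'|_2^2 + C_\varepsilon)|y|_2^2 \quad \forall y \in E_N.$

(F2c) $\lim_{N\to\infty} |P_NF - F_N\circ P_N|_2(x) = 0 \quad \forall x \in H_0^1.$

(F2d) For $\kappa_0$ and $Q_{\mathrm{reg}}$ as in (F2a), there exist $\kappa \in (0,\kappa_0)$, $p \in Q_{\mathrm{reg}}$ such that, for some $C_{p,\kappa} > 0$ and some $\omega : [0,\infty) \to [0,1]$ vanishing at infinity,

$|F_N\circ P_N|_2(x) \le C_{p,\kappa}\,\Theta_{p,\kappa}(x)\,\omega(\Theta_{p,\kappa}(x)) \quad \forall x \in H_0^1,\ N \in \mathbb{N}.$

Furthermore, we say that condition (F2+) holds if, in addition to (F2), we have:

(F2e) For all $\varepsilon \in (0,1)$, there exists $C_\varepsilon \in (0,\infty)$ such that, for all $N \in \mathbb{N}$ and $dx$-a.e. $x \in E_N$,

$(DF_N(x)(-\Delta)^{1/2}y,\ (-\Delta)^{-1/2}y) \le |y'|_2^2 + (\varepsilon|x'|_2^2 + C_\varepsilon)|y|_2^2 \quad \forall y \in E_N.$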
The main result of this section is the following: 


PROPOSITION 4.1. Let F be as in (2.15) and let assumptions (V), (41)-(43) 
be satisfied. Then (F2) holds. More precisely, (F2a) holds with ko :— ZEE 


8ao 
Qreg := [2, 00), (F2c) holds uniformly on Hq-balls, and (F2d) holds with 
p € [2, œ) N (q2 —3 + oo) and any k € (0, ko). If, in addition, (44) is sat- 
isfied, then (F2+-) holds. 


To prove our main results formulated in Section 2, we shall only use conditions (F2), (F2+), respectively. Before we prove Proposition 4.1, as a motivation, we shall prove that (F2) [in fact, even only (F2a)-(F2c)] and (F2e) will imply regularity and convergence (see also Theorem 6.4 below) of the above mentioned resolvents $(G^{(N)}_\lambda)_{\lambda>0}$.


COROLLARY 4.2. Let (A) and (F2a)-(F2c) hold and let $L_N$ be as in (4.1) with $F_N$ as in (F2). Let $(R^{(N)}_\lambda)_{\lambda>0}$ be the corresponding Markovian pseudo-resolvent on $L^\infty(E_N)$ from Proposition 3.1. For a bounded Borel measurable $f : X \to \mathbb{R}$, we define

$G^{(N)}_\lambda f := \big(R^{(N)}_\lambda(f|_{E_N})\big)\circ P_N.$

Then $\lambda G^{(N)}_\lambda$ is Markovian and $\lambda G^{(N)}_\lambda f \to f\circ P_N$ pointwise as $\lambda \to \infty$ for all bounded $f$ which are locally Lipschitz on $E_N$.

Let $\kappa_0$, $Q_{\mathrm{reg}}$ be as in (F2a) and let $\kappa \in (0,\kappa_0)$, $q \in Q_{\mathrm{reg}}$ with $\lambda_{q,\kappa}$ as in (F2a). Set $\lambda'_{q,\kappa} := \lambda_{q,\kappa} + C_{m_{q,\kappa}}$ with $m_{q,\kappa}$ as in (F2a) and function $\varepsilon \mapsto C_\varepsilon$ as in (F2b). Let $N \in \mathbb{N}$ and $f \in \mathrm{Lip}_{0,q,\kappa}$, $f$ bounded. Then





(43) |Ge f| « Vou GNxX)Mflaes — x€Xqo À> Age 


* 


and for y, y2 € Xs, 


CP Fon — Gf? fon) _ 161? fo — GY fo 
lyi — yale = IPN (yi — yale 
Va (PN yt) V Va c (Pn yo) 
A Rh ek EN Sai alee 
(4.4) < an o4. 


/ 
A> Àg 


In particular, if X > À ok V hg, then Go? f € [\e>0 Ds, ke and, provided 
fe, GO? y E (eso De. Furthermore, for all x € Hy ASN 


q,k? 
I. — 1369? f(x) — (f o Pw)(x)| 
(4.5) 





< m UN [PnP — Fu o Py |2(x)or Va, Gf 00,4,«- 


In suena for all A, > Àg, ET 


: (m 1 
(4.6) im, : sup Ala - ngo? r—f(a)20  vxeH. 

If, moreover, (F2e) iE let Ree = Age t+ Cig, With m, as in (F2a) and 
function £ > Ce as in (F2e). Then, for N € N and f € Lip, 4, f bounded, we 
have, for y1, y2 € Xq, 

(N) ex) 

IG; fo G) fO) 2 Vo,n(Pn yt) V Vo (PN Y2) 

(=A) ydh 7 AA k 


PROOF. To prove o (4.4) and (4.7), fix x € X4. By (F2a), we can apply 
Proposition 3.4 with Vo :— V, «| gy to conclude that, for A > A, 


GC? F| = [REP (Frey) (Pv) 


(4.7) (fF) 


1 " 
< —— VaL (PNX) mp VL 0f Or! 
À a Ag, ycE 


< —— — V, x (Pyx) sup VOIO), 
À d ha, yeX 




which proves (4.3). By (F2a), (F2b), respectively, (F2a), (F2e), we can apply 
Proposition 3.4 with V; :— V, «| p, to conclude that, for A > Ao :— Àg, y Of Àg, x if 
I := 0, respectively, / :— 1 and all yi, y? € XQ, 


IG f (yj) — 6 FOr)! 
I(—A)77 — ya) le 


«ARI F bey) Puy) — T 0 Ley Pw ya) 
T I-A) (Py yi — Puy2)l2 


< Vane (Pn yi) V Va. (PN ya) pup Vre Aa (DRI f 1g) O)h 
YEEN 


r Vo PN y1) V Vg, (Pn y2) 
7 À — Ag 
where we used both Proposition 3.4 and Lemma 3.6 in the last two steps. We 
note that, by our assumption on «o in (F2a), we really have that |Df leyl € 
L*(Ey, p dx), so the conditions to have (3.7) are indeed fulfilled. 


By the last part of Proposition 3.1, we have that $u := R^{(N)}_\lambda(f|_{E_N}) \in C^2(E_N)$ and that

(4.8) $\lambda u(x) - L_Nu(x) = f(x) \quad \forall x \in E_N.$


Hence, it follows from (4.3), (4.4), Lemma 3.6 and (2.8) that Gi" ) f EM es0Pp,n+e 
and, provided f € D, that Go? f € Meso De. Furthermore, (4.8) implies that, 
on Hi, 


(ugue: 


(A — D((RI? f Ig) o Pw) — f o P| 
= |(Py F — Fy o Py, D(AU? f 1g) o Py)| 


< aa NF — Fy o Pylo(f)0,q," Vq,« 9 PN, 

-M k 
where we used (4.4) and Lemma 3.6. Now (4.5) follows by (2.8) and (4.6) follows 
by (F2c). O 


Now we turn to the proof of Proposition 4.1, which will be the consequence of a number of lemmas which we state and prove first.

In the rest of this section, $\phi : (0,1)\times\mathbb{R} \to \mathbb{R}$ will be a function square integrable in the first variable locally uniformly in the second and continuous in the second variable, and $\psi \in C^1(\mathbb{R})$. For such functions, we define

(4.9) $F_\phi(x) := \phi(\cdot, x(\cdot)), \qquad G_\psi(x) := x'\,\psi'\circ x, \qquad x \in H_0^1.$

Note that $F_\phi : H_0^1 \to X$ and $G_\psi : H_0^1 \to X$.



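LEMMA 4.3. Let $\psi$ satisfy ($\Psi$), and $\theta \in C_0^\infty(-1,1)$, $0 \le \theta \le 1$, $\theta|_{[-1/2,1/2]} = 1$. For $N \in \mathbb{N}$, let $\psi_N(x) := \psi(x)\,\theta(x/N)$, $x \in \mathbb{R}$.

Then, for $N \in \mathbb{N}$, $\psi_N \in C^{1,1}(\mathbb{R})$ satisfies ($\Psi$) uniformly in $N$, that is, with some $C > 0$ and $\omega : \mathbb{R}_+ \to \mathbb{R}_+$, $\omega(r) \to 0$ as $r \to \infty$, independent of $N$. Moreover, $|G_\psi - G_{\psi_N}|_2 \to 0$ as $N \to \infty$ uniformly on balls in $H_0^1$.

PROOF. Let, for $x \in \mathbb{R}$, $\theta_1(x) := x\theta'(x)$ and $\theta_2(x) := x^2\theta''(x)$. Then

$\psi_N''(x) = \psi''(x)\,\theta\big(\tfrac{x}{N}\big) + 2\,\dfrac{\psi'(x)}{x}\,\theta_1\big(\tfrac{x}{N}\big) + \dfrac{\psi(x)}{x^2}\,\theta_2\big(\tfrac{x}{N}\big).$

Hence, the first assertion follows from Remark 2.1(i).

Note that $\psi_N(x) = \psi(x)$ whenever $|x| \le N/2$. Hence, the second assertion follows. □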


LEMMA 4.4. Let0 € C??(R), odd, 0 x 0' x 1, 0(x) =x for x € [—1, 1] and 
6(x) = 2 sign(x) for x € R \ [—2, 2]. 

For N €N, let 0y (x) := NO(N7!x), x eR and oy := On of. 

Then for all N € N, oy is a bounded function. 

If ġ satisfies (P1)-(®3), then so does dy, N € N, with the same qo > 1 and 
functions g, ho, hi, 80, '81 and c. Moreover, |Fg — Fay|l2 — 0 as N — oo uni- 
formly on balls in Hj A 

If, in addition, $ satisfies (4), then oy is twice continuously differentiable and 


1/2 
|92 Øn (r, x)| < coga(r) + cog3(r) o( r=): 
ré(0,1),x ER, 








X 
"Ir 
and 
> 3/2 lx | 
192. by (r, x)| < cogar) + cogs(r) o( =), 
ré(0,1),x ER, 








"XC 
r(l—r) 


with cg :— 1 v sup, £^|9"(£)]..X. 


PROOF. The first assertion is obvious. Then, given that $ satisfies («5 1), (452), 
so does @y since Oy is an odd contraction. Note that 8y (y) — On (E) < 0 whenever 
n «é and 0 x 8n (n) — Ou (E) < n — E for n > E. So, (3) holds also for jy if it 
holds for $. To prove the next assertion, we note that, since Oy (x) =x if |x| < N, 
for x € Hj, condition (61) implies that {g(1 + |xI2) x N} C (66,x() = 

‘on C, x C))). Hence, again by (1), 


Fe — Fay) = [lor x0) ow (rx) ar 


( q242 2 
s40 Ix B» J Les n/a 0) d", 




which converges to zero as N — oo uniformly for x in any ball in Hj. 
Finally, the last assertion follows from the following identities: with 02 (£) :— 


£0" (E), 





; dx)? 
dre On = Oy o 0)07,6 + Lipjan) (6 o £) T L 
0,0 8, 
arn = (Oy 09) 82,6 + Ligin) (4.0 o £) - d T 


LEMMA 4.5. Letó € CX ((—1, 1)), nonnegative, even, and f 8(x) dx = 1. For 
Be(0,1),x €R,r € (0, 1), let 


óg(r,x) :— 


pa arra 


and 
pelr, x) :— Í, ó(r, x — y)ôg(r, y) dy. 


Then g(r, -) € C??(R) for all r € (0, 1). 
If Q is bounded, then, for B € (0,1), n —0,1,2,..., xeRR and r € (0,1), 








a" fx 18 I(y) dy 
Ayn bp|(r,x) < loot Teta)" 





If ġ satisfies (1)—(43), then dg, B € (0, 1), does so, with the same q; € [2, oo] 
and q» € [1, oo) and functions hy and g; and g! — 242*g, hh) = hi +ho 4-222 *?g, 
go = go + 9(sup, o(r))g1, and œ' (r) :— 2 sup{w(s)|s -5,r0. 

Moreover, | Fs — Faslo(x) > 0 as B — 0 uniformly on balls in Hè. 


PROOF. The first two assertions are well-known properties of the convolution. 
By (1), for all B € (0,1), x € R andr € (0, 1), 


légtr, OI € ar) [ (1+ lx — yl?) 8g(r, y) dy 
< 2 g(0)(1 +x + (8/7 —5)? f iyrmso» dy). 


So, all dg, B € (0, 1), satisfy (41) with g’ = 247*15. 

By Remark 2.1(iv), since $g satisfy (1) uniformly in f € (0, 1), it suf- 
fices to verify ($2) for all x € R, |x| > 1. Then sign(x — y) = sign(x) for all 
y € Ug,re(o,1) Supp ôg (Cr, -) C (—1, 1), B € (0, 1). Since ¢ satisfies (2), for a.e. 


r € (0,1), all x € R, |x] > 1, B € (0, D), we obtain 
$a (r, x) sign(x) = Í, sign(x — y)o(r, E y) dy 


< ho(r) +hı(r) [ lx — yldg(r, y) dy 


< hot) - (I Br f 17180)dy). 
Hence, og, B € (0,1), satisfy (2) with the same hı as $ does and with 


hi = hı + ho 4 222**g. 
Set £(r, x) :5 — =, x E R, r e (0,1). By (#3), for all p € (0, po), x € R, 


NEN, Be (0, 1),7 € (0, 1), 


1 
"n (r, x + p) — ég(r, x)) 
1 
=- [ ($r, x +p —y) —b(r, x — y))8p(7, y) dy 
< gor) + ei] Ie x -DP Vole. x — y)l)5p(r, y) dy 


= go(r) + g1(r) Í j£, x) — By" PLo(I(r, x) — Byl)5(y) dy. 


By Remark 2.1(iv), we may assume w nonincreasing, by replacing c with @(r) :— 
sup... (s). Then, for |£(r, x)| < 2, 


Í i&(r, x) — By? VP o(I&r, x) — Byl)8(y) dy x 9w(0), 


and, for |£(r, x)| > 2, ZIE, X)| < IẸ(r, x) — Byl < 31&@, x)|, provided |y| < 1, 
hence, 


J ECx- By? ole, x) - Byl)80) dy < (IEW 1 GE). 


Thus, dg, B € (0, 1), satisfy ($3) with the same gı as $ does, and with gp = 
go + 9:0(0)g; and o (r) := 295), r € R4. 

Finally, to prove the last assertion, we first note that, for all x € HÀ and 
B € (0, 1), 


2 l 2 
[Fo — Fah = [. 16r. x0) — or x dr 


1 
«| sup 190» = pl, ydr 
0 yeR 


Iy|xix'l2 




But óg! (r, y) > $(r, y) as B > O locally uitiformly i in y for all r c (0, 1 and, since 
‘we have seen that each $g satisfies ($1) with 22+! 9 and q2, we also have that the 
integrand is bounded by 


222r e (9*1 + (Ix^]o + 1%)’. 


Therefore, the last assertion follows by Lebesgue's dominated convergence theo- 
rem. [] 


LEMMA 4.6. Define, for N €N, u € Wi (En), 
La yu(x) := $ Tr(Ay D^u)(x) + (x" + FŒ) + Gy(x), Du(x)), — x € Ey. 
Assume that ($2) holds. Let ko :— uh For k € (0, xo), let Ay :— 2k TrA + 


ult. Then 

(4.10) Lo, y Ve := Low (Velg,) SAeVe on En, 

and, for all X > 244, 

(4.11) Low Vie SAVe —mei8. on Ey, 

with 

(4.12) men t= min( Z, 2k — |hilik — ee m saoe?) (> 0). 


Moreover, for all q € [2, 00) and k € (0, xo), there exist àq, > 2X, and 
mq < min(g(q — 1), mx,,} depending only on q, x, |hoh, |hili; |Alx x and . 
Tr A such that 


(4.13) Lg,y Va, = Lo p (Vac ley) S Aq Vae — qq on Ey. 
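PROOF. First observe that, due to ($\Phi$2), for all $q \in [2,\infty)$ and $x \in H_0^1$,

(4.14) $(F_\phi(x), x|x|^{q-2}) \le \int_0^1 \big(h_1|x|^q + h_0|x|^{q-1}\big)\,dr \le |h_1|_1|x|_\infty^q + |h_0|_1|x|_\infty^{q-1}$

and

(4.15) $(G_\psi(x), x|x|^{q-2}) = -(q-1)\int_0^1 x'|x|^{q-2}\,\psi\circ x\,dr = -(q-1)\int_{x(0)}^{x(1)} \psi(\tau)|\tau|^{q-2}\,d\tau = 0,$

since $x(1) = x(0) = 0$.

To prove the first assertion, note that, for $x \in E_N$, $i,j = 1,\dots,N$,

(4.16) $\partial_i|x|_2^2 = 2(x,\eta_i) \quad\text{and}\quad \partial^2_{ij}|x|_2^2 = 2(\eta_i,\eta_j) = 2\delta_{ij}.$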
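
The vanishing of the $G_\psi$ term in (4.15) rests on an integration by parts followed by a change of variables. The sketch below spells this out, assuming $G_\psi(x) = x'\,\psi'\circ x = (\psi\circ x)'$ as in (4.9) and $x(0) = x(1) = 0$; the antiderivative $\Psi_q$ is a helper introduced here only for illustration.

```latex
% Assumptions: x \in H_0^1(0,1), q \ge 2, G_\psi(x) = (\psi \circ x)'.
% Hypothetical helper: \Psi_q(t) := \int_0^t \psi(\tau)\,|\tau|^{q-2}\,d\tau.
\begin{align*}
\frac{d}{dr}\big(x|x|^{q-2}\big) &= (q-1)\,|x|^{q-2}\,x', \\
(G_\psi(x), x|x|^{q-2})
  &= \int_0^1 (\psi\circ x)'\; x|x|^{q-2}\,dr
   = -(q-1)\int_0^1 \psi\circ x\;|x|^{q-2}\,x'\,dr \\
  &= -(q-1)\int_0^1 \frac{d}{dr}\,\Psi_q(x(r))\,dr
   = -(q-1)\big(\Psi_q(x(1)) - \Psi_q(x(0))\big) = 0 .
\end{align*}
% The boundary term in the integration by parts vanishes because x(0) = x(1) = 0.
```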


So, we have, for x € Ey by (4.15) with q — 2, 
(417) Loy Vc (x) = 2e (Tr Ay + (Fo(x), x) + 2 (x, Ax) — Ix l2). 
Now (4.14) for q = 2, together with the estimates |x|oo < Jw and the in- 
equality ab < 2ea* + E a, b,£ > O, imply that, for all e > Oand x € H}, 


holf 
(FŒ), x) < (hil + e)ix' ls + —+ ar —À., 
hence, 
Tr Aw + (Fo(x), x) + 2« (x, Ax) — |x’ 
h 1 
<TrA+ of — (1 — jh — E - 2«ap Js 


So, (4.10) follows by choosing ¢ > 0 so that the last term in brackets is equal to 
zero. Equation (4.11) follows by choosing € > 0 so that 


Holt) — À 
2K( TrA ; 
«( rAt+ a 5 


To prove the second assertion, observe that, for x € Ey, i, j — 1,..., N, 
8ilxlZ = aGrlxIt 7, mi), 
az |xI — q(q — D xI* ni. ny), 
aj Gclx 77, mi) = (q — DxI* ^, ning), 
(x12117, x^) = —(q — Dp t1 p. 
So by (4.15), we have, for x € Ew, 
Lo, y Va, (X) = (1 + [x 12) Lg, y Vie (X) 
(4.19) + qe ATUS x), 1x14 72x) + A (Ax, |x 72x] 


N 
2 “~~ EE 
t q(q — pet (rss ^ Yi) - |x’ 1x12 g 


i=] 
It follows from (4.11) that, for all à > 2àx, x € En, 
(4.20) (1+ IxlD Le, y Vc (x) € Va). — me (x È+ D). 


Below we shall use the following consequence of the inequality Izl? < 2|z'l2lz|2, 
z € Hi: For x € Hy and g > 2, 


(4.18) 


klg = bell? [5 < 2177) xl, 


(4.21) Bn 
-qix'ixit?7tLixig7. 




It follows by (4.14) and (4.21), together with Young’s inequality, that there ex- 
ists cı (q) > 0 depending only on q, such that, for all € > 0, 


(F(x), Ix|4 7x) 
< [hi lial + [hohilxifg! 
< qlhihi|x x71 ix a? 


iod) (q—1)/ 
de g4-YD/ holi Fags di be Tixg- U^ 


< e|x'x 771p 


+e (g) (hife! + nort Ps «-D/etya + pe. 


It follows from the estimate |z| p < [Zloo, (4.21) and Young's inequality that, for 
every & > 0, 


"ES, -] 
(Ax, [x|47*x)| < [Alx xlxlzlxio, < Ml» 


(4.23) < g|Alxo x |x Ix [17 7! 1x 27 
2 
9-112 , d 2 
< e|x'|x|?/ In + 45A ls x lee. 


Next, observe that 37 , Ajm2(r) > 0 for all r € (0, 1). Hence, it follows 
by (4.21) and Young's inequality that there exists c2(q) > 0 depending only on q, 
such that, for every & > 0, 


N N 
(^ pon vit) < Ixits^ 0 Ai 
i=] i=1 
(4.24) < g 4 a x^ 1x [2/271 ye je aD Tr A 
< e|x'|x i771 » 
+ co (q) Tr AyA/(*2 p—G—2)/ (G42) (4 3c |x12). 


Collecting (4.22), (4.23) and (4.24), we conclude that there exists c; > 0 de- 
pending only on q, such that, for every & € (0, 1), 


qe ATE, (x), 1x] 72x) + 4 (Ax, Ix] £72x)] 


N 
2 PA EA 
tqq — nerij (ii E Yt) [x [x]? '£ 


izl 
< eq& M (Ui E + [holy 9 *P + ALK x + CTr Aye) y, (x) 
— q(g — 1 — (4k + q)e) Ve (x) |x" |x [2/271 D. 


This together with (4.20) and (4.19) implies (4.13). □


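LEMMA 4.7. Let $\phi$ be continuously differentiable in the second variable such that $\sup_{|\xi|\le R}|\phi_x(\cdot,\xi)| \in L^1(0,1)$ for all $R > 0$ and let $\phi$ satisfy ($\Phi$3).

Then there exists a nonnegative function $\varepsilon \mapsto C(\varepsilon)$ depending only on $\omega$, $p_1$, $|g_0|_1$ and $|g_1|_{p_1}$, such that, for all $\varepsilon > 0$ and $x, y \in H_0^1$,

$\partial_y(F_\phi, y)(x) \le \tfrac12|y'|_2^2 + (\varepsilon|x'|_2^2 + C(\varepsilon))|y|_2^2.$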


If, moreover, © is twice continuously differentiable and there exist g5,g3 € 
L^ (0, 1), 2485 € L (0, 1) and a bounded  Borel-measurable function 
w:R,—> R4, @ (r) > 0 as r — oo, such that 








1/2 |x| 
aa ldxx (7, x)| € g2(7) + g3(7) Wr; (A) 
r € (0, D, x eR, 
and 


| 3/2 
duce) ta) gs) o(——), re(0,D,xeR 


x 
r(1—r) A/r(l— r) 


[which is the case, if à satisfies (®4)] then there exists a nonnegative function 
€ > C(e) depending only on w, pi, |gol1, |gilp, 18212; l23l2. lg4l1 and |gs|1 such 
that, for alle > O and x, y € Hj, 


9 Aya, (Fo, CA) ^ y)Qo x ily È + CIx' + Ce) yl. 








PROOF. As before, we set o (r, x) :— = l. Since $ is continuously differ- 


entiable in the second variable, (®3) implies that, for all x € R and r € (0, 1), 
(4.26) x (r, x) < gor) + Mle, x) |^ P o(o (r, x)). 


Fix x € Hj. Note that, for £, n € Hj, since Supig«g 6x (, E) € L (0, 1) for all 
R > 0, we have 





1 
3 (Fp, m) = [ ENA (r, x(r)) dr. 
Hence, (4.26) implies that, for y € Hj, 
1 
3 (Fp, y) (x) = I ODA, x(r)) dr 


< lylolgoli + 1yl55, p,- 1811p lo? / P! e o loo (x), 


where, for a, D > 0, we set 


lo*eP oo|so(x) :- sup |o? (r, x&))of (o (r, x(r)))]. 
r €(0,1) 




Note that, for y € Hè, ly|2, <2|y’l2]yl2 and, hence, 


2(pi-1 ] 2pi—l 
IY Spier —D < Ly [P1 jy |n )//n < Aly" lay lg P1 fp 


Hence, by Young's inequality, there exists Cp, > 0 such that 


Qy (Fo, y)(x) < 1|y'[5 +e, Ly [lgolt + gi Gn 7D a?o- oc |. (x)]. 
Observe now that, for all € > 0, 


Co lgilzp/ Gri 7D Ca Š o |. X) « elo |2, (x) 4 C (&), 


with Ĉ(e) :— sup(óplgilgr ^" ^s2aóP/Ori-D(s)s > 0 such that 
Ep gil CI Dgapu/Opi D (s) > e). Now the first assertion follows from the 
inequality |o [oo (x) = sup, UA < A2|x'|2, x € Hy , which is a consequence of 
the fundamental theorem of calculus (or of Sobolev embedding). 
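
The bound on $|\sigma|_\infty(x)$ by $\sqrt{2}\,|x'|_2$ can be checked directly from the fundamental theorem of calculus. The sketch below assumes $\sigma(r,x) = |x|/\sqrt{r(1-r)}$, as set at the beginning of this proof, and $x \in H_0^1(0,1)$; it also records the companion sup-norm estimate that is used repeatedly in this section.

```latex
% Assumptions: x \in H_0^1(0,1), so x(0) = x(1) = 0; \sigma(r,x) := |x|/\sqrt{r(1-r)}.
\begin{align*}
|x(r)| &= \Big|\int_0^r x'(s)\,ds\Big| \le \sqrt{r}\,|x'|_2, \qquad
|x(r)| = \Big|\int_r^1 x'(s)\,ds\Big| \le \sqrt{1-r}\,|x'|_2, \\
\frac{|x(r)|}{\sqrt{r(1-r)}}
  &\le \frac{\min(\sqrt{r},\sqrt{1-r})}{\sqrt{r}\,\sqrt{1-r}}\,|x'|_2
   = \frac{|x'|_2}{\max(\sqrt{r},\sqrt{1-r})}
   \le \sqrt{2}\,|x'|_2 ,
\end{align*}
% since \max(r, 1-r) \ge 1/2 for r \in (0,1).
% Companion estimate, for y \in H_0^1 and r \in (0,1):
\begin{align*}
y(r)^2 = 2\int_0^r y(s)\,y'(s)\,ds \le 2\,|y|_2\,|y'|_2 ,
\qquad\text{hence}\qquad |y|_\infty^2 \le 2\,|y'|_2\,|y|_2 .
\end{align*}
```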

To prove the second assertion, let z := (—A)~!/*y, y € Hj. Then (C A)!7y = 
=z", |z']2 = |yla and |z"|2 = [y'|2. Moreover, 


1 
a-a (Fo, (CAT ^ y)x) = — Í z" (r)z(r)óx (r. x(r))dr 


] 
= | Iz P(e.(r x()) dr 
(4.27) Š 


1 
+ J z (r)z(r)x' (r)óxx (r, x (r)) dr 


l 
n I Z(r)z(r)bur(r, x) dr. 


We can estimate the first term in the right-hand side of (4.27) in the same way 
as above. Indeed, note that (4.25) was shown in the proof of Remark 2.1(v) to 
imply (4.26). So, as above, we obtain that there exists a nonnegative function 
€ +> Ci(£) depending only on o, p1, |2o|1 and |gi|p, such that, for all € > 0, 


1 
i Iz (r)bx(r, x(r)) dr < Glz" 15 + (elx'l + C1())Iz/ 
(4.28) 1)..7;)2 ^! 2 
< 1y'[5 + (elx' [5 + Ci) Lys. 


To estimate the second and the last terms in the right-hand side of (4.27), we note 
that 


Iz'loo < (2lz” lalz’l2)'/* = ly'lalyla)”7, Izle $ 27/7 |! 1n = 27" yp. 
By (4.25) and the estimate |o loo (x) < /2|x’|2, we conclude that, for all £ > 0, 
xx C, x)la < lgala + leala|o "^o o a [., Gx) 


1 
elo ll? (x) + Cale) < sel ^ + C2(e) 





< 6.2174 







and 
lbxr(- x)h < leal: + lgshilo? w o oos (x) 
< zelo REE) + Cale) < ce i + C906), 
with 
C2(e) := gala + sup Leslos o(s) = > 0 such that |g3|20(s) > cue | 
Cy) = gal + sup  leshis? ^o is = 0 such that [gshorG) > cm 


Thus, it follows from Young's inequality that there exists a nonnegative function 
€ > C(e) dependent on o, |22/2, |g3]2, lg4l1 and |gs|1 only such that, for all 
€ € (0, 1), 


i 1 
Í z (r)z(r)»x (mó)éxx(r, x(7)) dr + if z'(r)z(r)bxr(r,x(r)) dr 


1,1/2 Py n3/2 n3/2 
6 


< pi" I + Ix! Co) + delx’3 


< Hy'l + (elx'l5 + CO) Iy 15. 
Now the second assertion follows from (4.28). O 


elx l> + C3(e)] 


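LEMMA 4.8. Let $\psi$ satisfy ($\Psi$).

Then there exists a nonnegative function $\varepsilon \mapsto C(\varepsilon)$ depending on $\omega$ and $C$ such that, for all $\varepsilon > 0$ and $x, y \in H_0^1$,

(4.29) $\partial_y(G_\psi, y)(x) \le \tfrac12|y'|_2^2 + (\varepsilon|x'|_2^2 + C(\varepsilon))|y|_2^2,$

$\partial_{(-\Delta)^{1/2}y}\big(G_\psi, (-\Delta)^{-1/2}y\big)(x) \le \tfrac12|y'|_2^2 + (\varepsilon|x'|_2^2 + C(\varepsilon))|y|_2^2.$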


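PROOF. Fix $x \in H_0^1$. Note that, for $\xi, \eta \in H_0^1$, we have

$\partial_\xi(G_\psi, \eta)(x) = -\int_0^1 \xi\,\eta'\,\psi'\circ x\,dr.$

Hence, for all $y \in H_0^1$,

$\partial_y(G_\psi, y)(x) = -\tfrac12\int_0^1 (y^2)'\,\psi'\circ x\,dr = \tfrac12\int_0^1 y^2\,x'\,\psi''\circ x\,dr \le \tfrac12|y|_4^2\,|x'|_2\,|\psi''\circ x|_\infty.$

Set $z := (-\Delta)^{-1/2}y$ so that $(-\Delta)^{1/2}y = -z''$. Then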


l l 
ay (Gy. CA)? y) (x) = |: ^J, o xdr < Lz lx o liss o xloo. 




PA E pee y 


Note that |y? < [yloolyle < 421y/l 
exists ĉ > 0 such that 


|y|,,. Hence, by Young's inequality, there 


1,4/3 
By (Gy, y)(x) < Hy + Aly Bix’ I^ o 0 x4, 
a x 3 
8—ayury (Gy, CA)? yx) x My + aly Bley Wax o x12. 


(V) implies that, for all ¢ > 0, IA (x) < elx} + C(e) with Cle) :— 
sup((C 4- r^ (ry) Ir > 0 such tat Cra} + w(r) > £^). Now the assertion 
follows from the estimate |x|o5 < zl xh. O 


LEMMA 4.9. Assume that sup g |O(-, x)| € L^(0, 1) for all R > 0. 

Then Fy: Hi —> L?(0, 1) is | - |-continuous on | - | Hj -balls. 

If, in addition, sup, |$ C, x)|] < co and $ is differentiable in the second 
variable with SUD|e|z R løs, £)| € L1(0, 1) for all R > 0, then, for all N € N, 
Py Fg o Py: EN — En is bounded and locally Lipschitz continuous. 

If @ satisfies (451), then, for all p € [2, oo), there exists cp,q,,4, > 0 such that 


IFslaG) X cp, o1glg 90927717096) — forallx e Hy, > 0. 


PROOF. Let (x7) nen be a bounded sequence in Hà and lim, ,oo Xn, =x € Hi 
in the | - |2-topology. Since a | -|1,2-bounded set is compact in Co(0, 1), we conclude 
that x, — x uniformly on (0, 1) and, hence, @(r, x,(r)) — @(r, x(r)) as n > oo 
for all r € (0, 1) and sup, ||(7, Xn (r)) < supy i1 lót, €) c L^(0, 1). Thus, 
the first assertion follows by the dominated convergence theorem. 

Let now the second assumption hold. Then, for all n € N, x, y € H L 


|(Fo(x), m)l < sup I$ C, £)ilmloo 


(Fa(x)— FO) m) | sup dx C. E| Inn lel — yloo- 


lE | xlx loo V1yloo 


Hence, the second assertion follows. 
To prove the last assertion, we first note that by (1), for all x € Hj, 


|Folo(x) < Igla |1 da Ix |? log gi -2) < lela (1+ |x |22) 





with s := 2412, and for p € [2, 00), 
2 
430 IM^ «72 S ejar < PE pg. 
0 


Since |x|? < Ixl[21x125?, it follows that 


(q2—242/s)/(p4-2) 
2 T 2 p+ 
Ix i£ < la 22/5 |x ga 2/s) « see (275 > I ix! Bx x| p] l 


Substituting s, we find 


gy Qa 244/00 (2) E 
x) A + lxh”) 


Fla) < gla, (2 
x [+ ix^ DL + fx lB PD, 
which implies the assertion. |] 
LEMMA 4.10. Gy: Hd > L7(0, 1) is continuous. 
If, in addition, V is bounded, then, for all N € N, PyGy o Py: En — En is 


bounded and locally Lipschitz continuous. 
If |W’ |(x) < CO + 1x12), then, for all x € Hå, x € (0, 00),.p € [2, o9), 


In particular, if v satisfies (Y), then 


O!/2rao/(p+2) (x). 


2N 3/(p+2) 
IGyl2(x) <2C (==) OPPAD G) ^ forallx e Hj. 


PROOF. Let (Xn) nen bea |- |1,2-bounded sequence such that lim; 455 Xn = x € 
Hj in the | - |2-topology. Since an | - |1,2-bounded set is compact in Co(0, 1), we 
conclude that x, — x uniformly on (0, 1) and, hence, y’ o xn — w’ o x uniformly 
on (0, 1). Thus, the first assertion follows by the definition of Gy. 

Let now v be bounded. Then, for all n e N, x, y € Hi : 


(Gy Œ), m)| x Wloolnh li 
(Gu) — Gy) n) x esssup [y (s)llm,lalx — vla. 


Is|xlxloo V1yloo 


Hence, the second assertion follows. 
The third assertion follows from the estimate |Gy|2(x) x C(1 + 1x [29 x"]o 


and (4.30). The last assertion is then clear, because we can take gg = 3 by Re- 
mark 2.16). C 


Now we are prepared for the following: 


PROOF OF PROPOSITION 4.1. Let N € N and let By denote the closed ball 
in Hj of radius N. By Lemmas 4.4 and 4.5, there exist By € (0, 1) such that 


1 
Su Fo — F o (x) e em 
JP N ( we le N 


for all B < By and By.) < By. Define 
(4.31) Fy :— F(oy)g, + Guy; N e N. 




Then $\lim_{N\to\infty}|F - F_N|_2 = 0$ uniformly on balls in $H_0^1$, where $F$ is as in (2.15), by Lemmas 4.3 and 4.4. Since by Lemmas 4.9 and 4.10, $F$ is $|\cdot|_2$-continuous on $|\cdot|_{1,2}$-balls, and since $P_Nx \to x$ in $H_0^1$ as $N \to \infty$ for all $x \in H_0^1$, it follows that (F2c) holds.

By Lemmas 4.4 and 4.5, it follows that Lemma 4.6 applies to $(\phi_N)_{\beta_N}$ and $\psi_N$ for all $q \in [2,\infty)$ with $\kappa_0$, $\lambda_{q,\kappa}$ and $m_{q,\kappa}$ independent of $N$. So, (F2a) holds.

By Lemmas 4.3 and 4.5, we see that Lemmas 4.7 and 4.8 apply to $(\phi_N)_{\beta_N}$ and $\psi_N$ with the functions $\varepsilon \mapsto C_\varepsilon$ independent of $N$. So, (F2b) holds.

Since in Lemma 4.9 we have (q2 — 1 + =) /(p + 2) <1 if and only if 
P>qa-3+ A (F2d) follows by Lemmas 4.9 and 4.10. 

The boundedness and local Lipschitz continuity of $F_N$ follow by Lemmas 4.5, 4.9 and 4.10. So, (F2) is proved.

If, in addition, ($\Phi$4) holds, then (F2e) follows from Lemmas 4.7 and 4.8 in the same way as we have derived (F2b). □


5. Some properties of the function spaces $WC_{p,\kappa}$, $W_1C_{p,\kappa}$ and $\mathrm{Lip}_{l,p,\kappa}$. Below, for a topological vector space $V$ over $\mathbb{R}$, let $V'$ denote its dual space.

The following we formulate for general completely regular topological spaces and recall that our $X = L^2(0,1)$ equipped with the weak topology is such a space.

Let $X$ be a completely regular topological space, $V : X \to [1,\infty]$ a function, and $X_V := \{V < \infty\}$ equipped with the topology induced by $X$. Analogously to (2.2), we define

(5.1) $C_V := \Big\{f : X_V \to \mathbb{R} \,\Big|\, f|_{\{V\le R\}} \text{ is continuous } \forall R \in \mathbb{R}_+ \text{ and } \lim_{R\to\infty}\sup_{\{V\ge R\}} V^{-1}|f| = 0\Big\},$

equipped with the norm $\|f\|_V := \sup V^{-1}|f|$. Obviously, $C_V$ is a Banach space.


THEOREM 5.1. Let $X$ be a completely regular topological space. Let $V : X \to [1,\infty]$ be of metrizable compact level sets $\{V \le R\}$, $R > 0$, and let $C_V$ be as above. Then $\sigma(C_V) = \mathcal{B}(X_V)$ and

(5.2) $C_V' = \Big\{\nu \,\Big|\, \nu \text{ is a signed Borel measure on } X_V,\ \int V\,d|\nu| < \infty\Big\},$

$\|\nu\|_{C_V'} = \int V\,d|\nu|$. In particular, $f_n \to f$ weakly in $C_V$ as $n \to \infty$ if and only if $(f_n)$ is bounded in $C_V$, $f \in C_V$, and $f_n \to f$ pointwise on $X_V$ as $n \to \infty$.


PROOF. Let $\nu$ be a signed Borel measure on $X_V$ such that $\int V\,d|\nu| < \infty$. Then $f \mapsto \nu(f) := \int f\,d\nu$ is a linear functional on $C_V$ and, since

$\Big|\int f\,d\nu\Big| \le \int \dfrac{|f|}{V}\,V\,d|\nu| \le \|f\|_V \int V\,d|\nu|,$

we conclude that $\nu \in C_V'$ and $\|\nu\|_{C_V'} \le \int V\,d|\nu|$.

Now let $l \in C_V'$. Note that, for every $f \in C_V$, there exists $x \in X_V$ such that $\|f\|_V = |f|(x)V^{-1}(x)$. Hence, we can apply [14], Corollary 36.5, to conclude that there exist positive $l_1, l_2 \in C_V'$ such that $l = l_1 - l_2$ and $\|l\|_{C_V'} = \|l_1\|_{C_V'} + \|l_2\|_{C_V'}$. So, we may assume that $l \ge 0$. Let $f_n \in C_V$, $n \in \mathbb{N}$, be such that $f_n \downarrow 0$ as $n \to \infty$. Then by Dini's theorem, $f_n \to 0$ as $n \to \infty$ uniformly on all sets $\{V \le R\}$, $R \ge 1$. Hence, $\|f_n\|_V \to 0$ as $n \to \infty$, so $l(f_n) \to 0$. $C_V$ is a Stone lattice generating the Borel $\sigma$-algebra on $X_V$. Indeed, we first note that $X_V \in \mathcal{B}(X)$ as a $\sigma$-compact set, and if $B \in \mathcal{B}(X_V)$, then $B = \bigcup_n B_n$ with $B_n \in \mathcal{B}(K_n)$, $K_n := \{V \le n\}$. But since $K_n$ is a metric space, $\mathcal{B}(K_n) = \sigma(C(K_n))$. But $C(K_n) = C_V|_{K_n}$ by Tietze's extension theorem (which holds for compact sets in completely regular spaces). Hence, $\mathcal{B}(K_n) = \sigma(C_V|_{K_n}) = \sigma(C_V) \cap K_n$. So, $B \in \sigma(C_V)$. We conclude by the Daniell-Stone theorem (cf., e.g., [5], 39.4) that there exists a positive Borel measure $\nu$ on $X_V$ such that

$\int f\,d\nu = l(f) \quad \forall f \in C_V.$

Since $1 \in C_V$, $\nu$ is a finite measure. To calculate $\|l\|_{C_V'}$, let $f_n \uparrow V$ be a sequence of bounded positive continuous functions on $X_V$ increasing to $V$. Such a sequence exists by [51], Lemma II.1.10, since $X_V$ as a union of metrizable compacts is strongly Lindelöf. Then $f_n \in C_V$ and $\|f_n\|_V \le 1$ for all $n \in \mathbb{N}$ and

$\|l\|_{C_V'} \ge \int f_n\,d\nu \to \int V\,d\nu \quad \text{as } n \to \infty.$

Hence, $\|l\|_{C_V'} = \int V\,d\nu$. The rest of the assertion follows from the dominated convergence theorem. □


COROLLARY 5.2. Let $X, Y$ be completely regular topological spaces. Let $\Theta : Y \to [1,\infty]$ have metrizable compact level sets, and let $V : X \to [1,\infty]$ be a function. Let $X_V$ and $Y_\Theta$, $C_V$ and $C_\Theta$ be as above. Let $M : C_\Theta \to C_V$ be a positive bounded linear operator. Then there exists a kernel $m(x,dy)$ from $X_V$ to $Y_\Theta$ such that, for all $f \in C_\Theta$, $Mf(x) = \int f(y)\,m(x,dy)$ and $\int \Theta(y)\,m(x,dy) \le \|M\|_{C_\Theta\to C_V}V(x)$.


COROLLARY 5.3. An algebra of bounded continuous functions on $X_V$ generating $\mathcal{B}(X_V)$ is dense in $C_V$.


PROOF. By a simple monotone class argument, it follows that the algebra forms a measure determining class on $X_V$. So by Theorem 5.1, it follows that the algebra is dense in $C_V$ with respect to the weak topology, hence, also with respect to the strong topology since it is a linear space. □




REMARK 5.4. In fact, on $X_V$ there is a generalization of the full Stone-Weierstrass theorem and it can be deduced from the Daniell-Stone theorem, even in more general cases than considered here. In particular, the algebra in Corollary 5.3 generates $\mathcal{B}(X_V)$ if it separates points. We refer to [47].


LEMMA 5.5. Let $X$ be a completely regular space, let $V, \Theta: X \to [1,\infty]$
have metrizable compact level sets, $V \le c\,\Theta$ for some $c \in (0,\infty)$, and such that,
for all $R > 0$, there exists $R' > R$ such that $\{V \le R\}$ is contained in the closure of the
set $\{V \le R'\} \cap X_\Theta$.

Then $C_V \subset C_\Theta$ continuously and densely.


PROOF. Note that $X_\Theta \subset X_V$. If $f \in C_V$, then, for $R \in (0,\infty)$,


ifl 
HE ( sue E: y 4 ARI flv, 


hence, 


Hu IF| 


sup —— <c Su crus rales . 

ee o = Es n y t zg v 
Letting $R \to \infty$, we conclude that $f|_{X_\Theta} \in C_\Theta$. Moreover, the last assumption
implies that, if $f \in C_V$ vanishes on $X_\Theta$, then it vanishes on $\{V \le R\}$ for every
$R > 0$, since $f$ is continuous on $\{V \le R'\}$. Hence, the restriction to $X_\Theta$ is an
injection $C_V \to C_\Theta$. Since $V \le c\,\Theta$, the injection is continuous. The density follows
from Corollary 5.3. Indeed, we have seen in its proof that $\sigma(C_V) = \mathcal{B}(X_V)$. But
then $\sigma(C_V|_{X_\Theta}) = \sigma(C_V) \cap X_\Theta \supset \mathcal{B}(X_V) \cap X_\Theta = \mathcal{B}(X_\Theta)$, since $X_\Theta \in \mathcal{B}(X)$. □


Now we come to our concrete situation. 


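COROLLARY 5.6. For $p \in [2,\infty)$, $p' > p$, and $\kappa \in (0,\infty)$, $\kappa' > \kappa$, we have
$WC_{p,\kappa} \subset WC_{p',\kappa'}$ and $WC_{p,\kappa} \subset W_1C_{p,\kappa} \subset W_1C_{p',\kappa'}$ densely and continuously.


PROOF. Note that, for $x \in L^p(0,1)$, $p > 1$, $P_N x \in H_0^1$, $N \in \mathbb{N}$, and $P_N x \to x$
in $L^p(0,1)$ as $N \to \infty$ (see, e.g., [40], Section 2c16). Also by (2.8), $V_{p,\kappa} \circ P_N \le
\alpha_p V_{p,\kappa}$ and, hence, $\{P_N x \mid V_{p,\kappa}(x) \le R,\ N \in \mathbb{N}\} \subset \{V_{p,\kappa} \le \alpha_p R\} \cap H_0^1$.
Furthermore, since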


A NDS 
(=) xP 2y B= x'a i 


d —112 
= [Ix n eA 5 [Ix enh 5 


E93 112 
< x! xxi? 7-15, 


KOLMOGOROV EQUATIONS IN INFINITE DIMENSIONS 701 


it follows that there exists $c_p \in (0,\infty)$ such that
$$\Theta_{p,\kappa} \le c_p\,\Theta_{p',\kappa'}. \tag{5.3}$$


Now the assertion follows from Lemma 5.5. □


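LEMMA 5.7. Let $l \in \mathbb{Z}_+$, $p \in [2,\infty)$, $\kappa \in (0,\infty)$, $(f_n)_{n\in\mathbb{N}} \subset \mathrm{Lip}_{l,p,\kappa}$ be such
that $f(x) := \lim_{n\to\infty} f_n(x)$ exists for all $x \in X_p$. Then

$$\|f\|_{p,\kappa} \le \liminf_{n\to\infty}\|f_n\|_{p,\kappa} \quad\text{and}\quad (f)_{l,p,\kappa} \le \liminf_{n\to\infty}(f_n)_{l,p,\kappa}.$$

In particular, $(\mathrm{Lip}_{l,p,\kappa}, \|\cdot\|_{\mathrm{Lip}_{l,p,\kappa}})$ is complete.


PROOF. The assertion follows from the fact that, for a set $\Omega$ and $\psi_n: \Omega \to \mathbb{R}$,
$n \in \mathbb{N}$, we have $\sup_{\omega\in\Omega}\liminf_{n\to\infty}\psi_n(\omega) \le \liminf_{n\to\infty}\sup_{\omega\in\Omega}\psi_n(\omega)$. □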
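
The inequality used in this proof can be checked in two lines. The derivation below is only a sketch; the identification $\psi_n = |f_n|/V_{p,\kappa}$ assumes that $\|\cdot\|_{p,\kappa}$ is the weighted supremum norm $\sup_{X_p}|f|/V_{p,\kappa}$, in line with how the norm on $C_V$ is used in this section.

```latex
% Fix \omega_0 \in \Omega. For every n we have
%   \psi_n(\omega_0) \le \sup_{\omega\in\Omega}\psi_n(\omega),
% and \liminf preserves pointwise inequalities between sequences, so
\liminf_{n\to\infty}\psi_n(\omega_0)
  \;\le\; \liminf_{n\to\infty}\,\sup_{\omega\in\Omega}\psi_n(\omega).
% The right-hand side does not depend on \omega_0; taking the supremum over \omega_0:
\sup_{\omega\in\Omega}\,\liminf_{n\to\infty}\psi_n(\omega)
  \;\le\; \liminf_{n\to\infty}\,\sup_{\omega\in\Omega}\psi_n(\omega).
% Assumed application: \Omega = X_p, \psi_n = |f_n|/V_{p,\kappa}. Since f_n \to f pointwise,
% the left-hand side equals \sup_{X_p}|f|/V_{p,\kappa}, i.e. the weighted norm of f.
```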


PROPOSITION 5.8. Let $l \in \mathbb{Z}_+$, $p \in [2,\infty)$, and $\kappa \in (0,\infty)$. Let $(f_n)_{n\in\mathbb{N}}$ be a
bounded sequence in $\mathrm{Lip}_{l,p,\kappa}$. Then there exists a subsequence $(f_{n_k})_{k\in\mathbb{N}}$
converging pointwise to some $f \in \mathrm{Lip}_{l,p,\kappa}$.

If $l > 0$, then $f$ is sequentially weakly continuous on $X_p$.


PROOF. Let $Y \subset X_p$ be countable such that $Y \cap \{V_{p,\kappa} < n\}$ is $|\cdot|_p$-dense in
$\{V_{p,\kappa} < n\}$ for all $n \in \mathbb{N}$, and let $(f_{n_k})_{k\in\mathbb{N}}$ be a subsequence converging pointwise
on $Y$. Since $f_{n_k}$, $k \in \mathbb{N}$, are bounded in $\mathrm{Lip}_{l,p,\kappa}$, they are $|\cdot|_p$-equicontinuous on
the $|\cdot|_p$-open sets $\{V_{p,\kappa} < n\}$ for all $n \in \mathbb{N}$. Hence, there exists a $|\cdot|_p$-continuous
function $f: X_p \to \mathbb{R}$ such that $f_{n_k}(x) \to f(x)$ as $k \to \infty$ for all $x \in X_p$. By
Lemma 5.7, we have $f \in \mathrm{Lip}_{l,p,\kappa}$.

Since $f_{n_k}$, $k \in \mathbb{N}$, are $|(-\Delta)^{-l/2}\,\cdot|_2$-equicontinuous, $f$ is $|(-\Delta)^{-l/2}\,\cdot|_2$-continuous,
in particular, sequentially weakly continuous on $X_p$. □


6. Construction of resolvents and semigroups. In this section we construct
the resolvent and semigroup in the spaces $WC_{p,\kappa}$ associated with the differential
operator $L$ defined in (1.2) with $F$ satisfying (F2).


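PROPOSITION 6.1. Let $F: H_0^1 \to X$ satisfy (F2a) and (F2c), and let $\kappa_0$,
$Q_{\mathrm{reg}}$ and $\lambda_{q,\kappa}$, $m_{q,\kappa}$ for $q \in Q_{\mathrm{reg}}$ be as in (F2a). Assume that $V_{\kappa_1}F^{(k)} \in W_1C_{q,\kappa}$
for all $k \in \mathbb{N}$ for some $q \in Q_{\mathrm{reg}}$ and $\kappa \in (0,\kappa_0)$, $\kappa_1 \in [0,\kappa)$. Then we have

$$\|u\|_{q,\kappa} \le \frac{1}{m_{q,\kappa}}\,\|\lambda u - Lu\|_{1,q,\kappa} \qquad \forall u \in D_{\kappa_1},\ \lambda \ge \lambda_{q,\kappa}. \tag{6.1}$$

For the proof of this proposition, we need the following two results.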


LEMMA 6.2. Let $q \in [2,\infty)$, $\kappa \in (0,\infty)$.




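(i) $V_{q,\kappa}$ is Gâteaux differentiable on $L^q(0,1)$ with derivative given by

$$DV_{q,\kappa}(x) = V_{q,\kappa}(x)\Bigl(2\kappa x + q\,\frac{x|x|^{q-2}}{1+|x|_q^q}\Bigr) \quad \bigl(\in L^{q/(q-1)}(0,1)\bigr). \tag{6.2}$$

(ii) On $H_0^1$ the function $V_{q,\kappa}$ is twice Gâteaux differentiable. Moreover,
$DV_{q,\kappa}: H_0^1 \to H_0^1$ [$\subset H^{-1}$, see (2.1)], $D^2V_{q,\kappa}: H_0^1 \to \mathcal{L}(L^2(0,1))$ [:= bounded
linear operators on $L^2(0,1)$]. Furthermore, both maps are continuous and, for
$x, \xi, \eta \in H_0^1$,

$$\begin{aligned}
(\xi, D^2V_{q,\kappa}(x)\eta) = V_{q,\kappa}(x)\Bigl[\,&\Bigl(2\kappa(\xi,x) + q\,\frac{(\xi, x|x|^{q-2})}{1+|x|_q^q}\Bigr)
 \times \Bigl(2\kappa(\eta,x) + q\,\frac{(\eta, x|x|^{q-2})}{1+|x|_q^q}\Bigr) \\
 &+ 2\kappa(\xi,\eta) + q(q-1)\,\frac{(\xi, \eta|x|^{q-2})}{1+|x|_q^q}
 - q^2\,\frac{(\xi, x|x|^{q-2})(\eta, x|x|^{q-2})}{(1+|x|_q^q)^2}\Bigr].
\end{aligned} \tag{6.3}$$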

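PROOF. Identities (6.2) and (6.3) follow from the formulas

$$\frac{\partial}{\partial\xi}|x|_q^q = q(\xi, x|x|^{q-2}), \qquad \frac{\partial}{\partial\xi}(x|x|^{q-2}, \eta) = (q-1)(|x|^{q-2}\eta, \xi) \quad\text{and}$$

$$\frac{\partial^2}{\partial\xi\,\partial\eta}|x|_q^q = q(q-1)(\xi, \eta|x|^{q-2}), \qquad x, \xi, \eta \in H_0^1.$$

The continuity of $DV_{q,\kappa}$ and $D^2V_{q,\kappa}$ in the mentioned topologies follows from the
fact that, given $x_n \to x$ in $H_0^1$ as $n \to \infty$, then $x_n' \to x'$ in $L^2(0,1)$ and $x_n \to x$
in $C_0[0,1]$ as $n \to \infty$. □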
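
As a worked check of (6.2), the chain rule can be carried out explicitly. This is a sketch under the assumption that $V_{q,\kappa}(x) = e^{\kappa|x|_2^2}(1+|x|_q^q)$; that definition is given earlier in the paper and is not restated in this section, but it is consistent with the function $h$ used in the proof of Lemma 6.3.

```latex
% Assumed definition: V_{q,\kappa}(x) = e^{\kappa|x|_2^2}\,(1+|x|_q^q).
% Directional derivatives of the two factors in direction \xi:
\frac{\partial}{\partial\xi}\,e^{\kappa|x|_2^2} = 2\kappa\,(\xi,x)\,e^{\kappa|x|_2^2},
\qquad
\frac{\partial}{\partial\xi}\,|x|_q^q = q\,(\xi, x|x|^{q-2}).
% Product rule:
\frac{\partial}{\partial\xi}V_{q,\kappa}(x)
  = 2\kappa(\xi,x)\,e^{\kappa|x|_2^2}(1+|x|_q^q) + e^{\kappa|x|_2^2}\,q\,(\xi, x|x|^{q-2})
  = V_{q,\kappa}(x)\,\Bigl(\xi,\; 2\kappa x + q\,\frac{x|x|^{q-2}}{1+|x|_q^q}\Bigr).
% Reading off the gradient gives (6.2). Since |x|x|^{q-2}| = |x|^{q-1} \in L^{q/(q-1)}(0,1)
% for x \in L^q(0,1), the derivative lies in L^{q/(q-1)}(0,1).
```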


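LEMMA 6.3. Let $q \in [2,\infty)$ and $\kappa \in (0,\infty)$. Let $u \in WC_{q,\kappa}$ be such that
$u = u \circ P_N$ for some $N \in \mathbb{N}$. Then there exists $x_0 \in (C_0 \cap C^1)(0,1)$ such that

$$\|u\|_{q,\kappa} = \frac{|u|}{V_{q,\kappa}}(x_0).$$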





PROOF. We may assume $u \ne 0$. Since $V_{q,\kappa}^{-1}|u|$ is weakly upper semicontinuous
on $X_q$ and $V_{q,\kappa}$ has weakly compact level sets, there exists $x_0 \in X_q$
such that $\|u\|_{q,\kappa} = |u|(x_0)\,V_{q,\kappa}^{-1}(x_0)$. Set $x_1 := P_N x_0$ and $x_2 := x_0 - x_1$. Since
$u(x_0) = u(P_N x_0)$, we conclude that




Hence, by Lemma 6.2(i), we have that $(DV_{q,\kappa}(x_0), \eta) = 0$ for all $\eta \in L^q(0,1) \cap
E_N^\perp$. Since $\{\eta_k \mid k \in \mathbb{N}\}$ is a Schauder basis of $L^s(0,1)$ for all $s \in (1,\infty)$ (cf. [40],
Section 2c16), it follows that $DV_{q,\kappa}(x_0) \in E_N \subset (C_0 \cap C^1)(0,1)$.

Consider $h \in C^1(\mathbb{R})$, $h(s) := 2\kappa s + q\,\frac{s|s|^{q-2}}{1+|x_0|_q^q}$, $s \in \mathbb{R}$. By (6.2),
$DV_{q,\kappa}(x_0) = V_{q,\kappa}(x_0)\,h \circ x_0$. Hence, $h \circ x_0 \in (C_0 \cap C^1)(0,1)$. Since, for $s \in \mathbb{R}$,
$h'(s) = 2\kappa + \frac{q(q-1)|s|^{q-2}}{1+|x_0|_q^q} \ge 2\kappa > 0$, the assertion follows, by the inverse function
theorem. □


PROOF OF PROPOSITION 6.1. For $N \in \mathbb{N}$, we introduce a differential operator
$L^{(N)}$ on the space of all continuous functions $v: H_0^1 \to \mathbb{R}$ having continuous
partial derivatives up to second order in all directions $\eta_k$, $k \in \mathbb{N}$, defined by


N 
L™ u(x) = 15 ^ Auüjv(x) + a (Œ, ng) + (FŒ), m))kvQ), — xe Ho. 
= rem 


Let $\lambda \ge \lambda_{q,\kappa}$, $u \in D_{\kappa_1}$, $u = u \circ P_N$ for some $N \in \mathbb{N}$. Then, for $m \ge N$ and
$x \in H_0^1$,
(. — Du = (à — L*?), = —V, L™ (uy. — 2(AmDVa,, DUVA) 
E UV g (A Ley, ,. 
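Since $u \in D_{\kappa_1} \subset WC_{q,\kappa}$, Lemma 6.3 implies that there exists $x_0 \in (C_0 \cap
C^1)(0,1)$ such that $\|u\|_{q,\kappa} = \frac{|u|}{V_{q,\kappa}}(x_0)$. We may assume, without loss of
generality, that $u(x_0) > 0$. Then $x_0$ is a point where the function $uV_{q,\kappa}^{-1}$ achieves its
maximum. Hence,

$$D(uV_{q,\kappa}^{-1})(x_0) = 0 \quad\text{and}\quad L^{(m)}(uV_{q,\kappa}^{-1})(x_0) \le 0.$$

Therefore,

$$(\lambda - L)u(x_0) \ge \|u\|_{q,\kappa}\,\liminf_{m\to\infty}\,(\lambda - L^{(m)})V_{q,\kappa}(x_0).$$

For $m \in \mathbb{N}$, let now $L_m$ be as in (4.1). Note that

$$\bigl|L_m(V_{q,\kappa}|_{E_m}) \circ P_m - L^{(m)}V_{q,\kappa}\bigr|(x) \to 0 \qquad \text{as } m \to \infty,\ x \in H_0^1.$$

This is so since $A$ is of trace class, (F2c) holds and, for $x \in H_0^1$, $P_m x \to x$ in $H_0^1$
as $m \to \infty$ and hence, by Lemma 6.2(ii), $DV_{q,\kappa}(P_m x) \to DV_{q,\kappa}(x)$ in $H_0^1$ and
$D^2V_{q,\kappa}(P_m x) \to D^2V_{q,\kappa}(x)$ in $\mathcal{L}(L^2(0,1))$ as $m \to \infty$. Hence, by (F2a),

$$\begin{aligned}
(\lambda - L)u(x_0) &\ge \|u\|_{q,\kappa}\,\liminf_{m\to\infty}\,(\lambda - L_m)(V_{q,\kappa}|_{E_m})(P_m x_0) \\
 &\ge m_{q,\kappa}\,\|u\|_{q,\kappa}\,\liminf_{m\to\infty}\Theta_{q,\kappa}(P_m x_0) = m_{q,\kappa}\,\|u\|_{q,\kappa}\,\Theta_{q,\kappa}(x_0).
\end{aligned}$$

Since, by assumption, $V_{\kappa_1}F^{(k)} \in W_1C_{q,\kappa}$, $k \in \mathbb{N}$, it follows that $Lu \in W_1C_{q,\kappa}$. So,
the assertion follows. □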




Now we can prove our main existence result on resolvents and semigroups (see 
also Proposition 6.7 below). 


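THEOREM 6.4. Let (A), (F2) hold, and let $\kappa_0$, $Q_{\mathrm{reg}}$ be as in (F2a), $\kappa \in (0,\kappa_0)$
and $p \in Q_{\mathrm{reg}}$ be as in (F2d). Let $\kappa^* \in (\kappa,\kappa_0)$, $\kappa_1 \in (0,\kappa^*-\kappa]$, and let
$\lambda_{p,\kappa^*}$ and $\lambda'_{2,\kappa_1}$ be as in Corollary 4.2, with $\kappa^*$ and $\kappa_1$, respectively, replacing $\kappa$.

Then for $\lambda > \lambda_{p,\kappa^*} \vee \lambda'_{2,\kappa_1}$, $((\lambda - L), D_{\kappa_1})$ is one-to-one and has a dense
range in $W_1C_{p,\kappa^*}$. Its inverse $(\lambda - L)^{-1}$ has a unique bounded linear extension
$G_\lambda: W_1C_{p,\kappa^*} \to WC_{p,\kappa^*}$, defined by the following limit:

$$\lambda G_\lambda f := \lim_{m\to\infty}\lambda G_\lambda^{(m)} f, \qquad f \in \mathrm{Lip}_{0,2,\kappa_1},\ f \text{ bounded},\ \lambda > \lambda_{p,\kappa^*} \vee \lambda'_{2,\kappa_1},$$

weakly in $WC_{p,\kappa^*}$ (hence, pointwise on $X_p$), uniformly in $\lambda \in [\lambda_*,\infty)$ for all $\lambda_* >
\lambda_{p,\kappa^*} \vee \lambda'_{2,\kappa_1}$. Furthermore,

$$\lim_{m\to\infty}\lambda(\lambda - L)G_\lambda^{(m)} f = \lambda f$$

weakly in $W_1C_{p,\kappa^*}$ uniformly in $\lambda \in [\lambda_*,\infty)$. $G_\lambda$, $\lambda > \lambda_{p,\kappa^*} \vee \lambda'_{2,\kappa_1}$, is a Markovian
pseudo-resolvent on $W_1C_{p,\kappa^*}$ and a strongly continuous quasi-contractive
resolvent on $WC_{p,\kappa^*}$ with $\|G_\lambda\|_{WC_{p,\kappa^*} \to WC_{p,\kappa^*}} \le (\lambda - \lambda_{p,\kappa^*})^{-1}$. $G_\lambda$ is associated
with a Markovian quasi-contractive $C_0$-semigroup $P_t$ on $WC_{p,\kappa^*}$ satisfying

$$\|P_t\|_{WC_{p,\kappa^*} \to WC_{p,\kappa^*}} \le e^{\lambda_{p,\kappa^*}t}, \qquad t > 0.$$

For the proof of the theorem, we need the following lemma.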


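LEMMA 6.5. Let $G_\lambda$, $\lambda > \lambda_0$, be a pseudo-resolvent on a Banach space $\mathbb{F}$,
such that $\|\lambda G_\lambda\|_{\mathbb{F}\to\mathbb{F}} \le M$ for all $\lambda > \lambda_0$. Then the set $\mathbb{F}_G$ of strong continuity
of $G$,

$$\mathbb{F}_G := \{f \in \mathbb{F} \mid \lambda G_\lambda f \to f \text{ as } \lambda \to \infty\},$$

is the (weak) closure of $G_\lambda\mathbb{F}$.


PROOF. First observe that $\mathbb{F}_G$ is a closed linear subspace of $\mathbb{F}$. Indeed, let
$f \in \mathbb{F}$, $f_n \in \mathbb{F}_G$, $n \in \mathbb{N}$, such that $f_n \to f$ as $n \to \infty$. Then

$$\lambda G_\lambda f - f = (\lambda G_\lambda f_n - f_n) + (\lambda G_\lambda - \mathrm{id})(f - f_n).$$

The first term in the right-hand side vanishes as $\lambda \to \infty$ for all $n \in \mathbb{N}$ and the
second term vanishes as $n \to \infty$ uniformly in $\lambda > \lambda_0$, since $\|\lambda G_\lambda - \mathrm{id}\|_{\mathbb{F}\to\mathbb{F}} \le
M + 1$. So, we conclude that $\lambda G_\lambda f \to f$ as $\lambda \to \infty$.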

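By the resolvent identity, for $f \in \mathbb{F}$ and $\lambda, \mu > \lambda_0$, we have

$$\lambda G_\lambda G_\mu f = \frac{\lambda}{\lambda-\mu}\,G_\mu f - \frac{1}{\lambda-\mu}\,\lambda G_\lambda f \to G_\mu f$$

as $\lambda \to \infty$ since $\|\lambda G_\lambda f\|_{\mathbb{F}} \le M\|f\|_{\mathbb{F}}$. Thus, $G_\mu\mathbb{F} \subset \mathbb{F}_G$. On the other hand,
$\mathbb{F}_G \subset \overline{G_\lambda\mathbb{F}}$ by definition. Finally, since $G_\lambda\mathbb{F}$ is linear, its weak and strong closures
coincide, by the Mazur theorem. □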


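PROOF OF THEOREM 6.4. We have that (F2d) holds with $\kappa^*$ replacing $\kappa$,
and for all $k \in \mathbb{N}$, that $F^{(k)} \in W_1C_{p,\kappa}$ by (F2c) and (F2d), so $V_{\kappa_1}F^{(k)} \in W_1C_{p,\kappa^*}$.
Therefore, Proposition 6.1 implies that $(\lambda - L): D_{\kappa_1} \to W_1C_{p,\kappa^*}$ is one-to-one
with bounded left inverse from $W_1C_{p,\kappa^*} \supset (\lambda - L)(D_{\kappa_1})$ to $WC_{p,\kappa^*}$ for all
$\lambda \ge \lambda_{p,\kappa^*}$.

Now we prove that $(\lambda - L)(D_{\kappa_1})$ is dense in $W_1C_{p,\kappa^*}$ for $\lambda > \lambda'_{2,\kappa_1}$. Let $m \in \mathbb{N}$,
$f \in \mathrm{Lip}_{0,2,\kappa_1}$ ($\subset W_1C_{p,\kappa^*}$), $f$ bounded, and $\lambda > \lambda'_{2,\kappa_1}$. By Corollary 4.2, $G_\lambda^{(m)} f \in
\bigcap_{\varepsilon>0} D_{\kappa_1+\varepsilon}$ and, by (4.6), $(\lambda - L)G_\lambda^{(m)} f(x) \to f(x)$ as $m \to \infty$ for all $x \in H_0^1$,
and by (4.5) and (F2d),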


|a. — LG” f(x) — (f o Paa) < O je X) Va (X) (20,2, 


À — ^5 x 
2 
ae "CCS (x)(f20,2, . 


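Hence, $|\lambda(\lambda - L)G_\lambda^{(m)} f - \lambda f| \to 0$ as $m \to \infty$ weakly in $W_1C_{p,\kappa^*}$, uniformly in
$\lambda \in [\lambda_*,\infty)$ for all $\lambda_* > \lambda'_{2,\kappa_1}$, by Theorem 5.1. By Corollary 5.3, $D$ ($\subset \mathrm{Lip}_{0,2,\kappa_1}$)
is dense in $W_1C_{p,\kappa^*}$. So, taking $f \in D$ and recalling that $G_\lambda^{(m)} f \in D_{\kappa_1}$ by
Corollary 4.2, we conclude that $(\lambda - L)(D_{\kappa_1})$ is of (weakly) dense range. Therefore, for
$\lambda > \lambda'_{2,\kappa_1} \vee \lambda_{p,\kappa^*}$, the left inverse $(\lambda - L)^{-1}$ can be extended to a bounded linear
operator $G_\lambda: W_1C_{p,\kappa^*} \to WC_{p,\kappa^*}$. Then one has $\lambda G_\lambda^{(m)} f \to \lambda G_\lambda f$ as $m \to \infty$
weakly in $WC_{p,\kappa^*}$ (in particular, pointwise on $X_p$) for all $\lambda > \lambda'_{2,\kappa_1} \vee \lambda_{p,\kappa^*}$ and all
$f \in \mathrm{Lip}_{0,2,\kappa_1} \cap \mathcal{B}_b(X)$. So, $\lambda G_\lambda$ is Markovian and $\lambda \mapsto G_\lambda f$ is decreasing if $f \ge 0$
for such $\lambda$, since $G_\lambda^{(m)} f$ has the same properties. In addition, for $\nu \in WC'_{p,\kappa^*}$,
$\nu \ge 0$ (cf. Theorem 5.1), and $\lambda \ge \lambda_* > \lambda'_{2,\kappa_1} \vee \lambda_{p,\kappa^*}$,

$$\int |\lambda G_\lambda^{(m)} f - \lambda G_\lambda f|\,d\nu \le \int G_{\lambda_*}\bigl|\lambda(\lambda - L)G_\lambda^{(m)} f - \lambda f\bigr|\,d\nu.$$

Therefore, the weak convergence of $(\lambda G_\lambda^{(m)} f)_{m\in\mathbb{N}}$ to $\lambda G_\lambda f$ in $WC_{p,\kappa^*}$ is, in
fact, uniform in $\lambda \in [\lambda_*,\infty)$. Furthermore, by (4.3), $(\lambda - \lambda_{p,\kappa^*})\|G_\lambda f\|_{p,\kappa^*} \le
\|f\|_{p,\kappa^*}$, since $P_N \to \mathrm{id}_{X_p}$ strongly as $N \to \infty$ by [40], Section 2c16. Because
$D$ is dense in $WC_{p,\kappa^*}$, it follows that

$$\|G_\lambda\|_{WC_{p,\kappa^*} \to WC_{p,\kappa^*}} \le (\lambda - \lambda_{p,\kappa^*})^{-1}$$

by continuity. Note that, for $u \in D_{\kappa_1}$, $\lambda, \mu > \lambda'_{2,\kappa_1} \vee \lambda_{p,\kappa^*}$, one has $u - G_\mu(\lambda -
L)u = (\mu - \lambda)G_\mu u$ since $G_\mu$ is the left inverse to $(\mu - L)$. Hence, for $f \in$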



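$(\lambda - L)(D_{\kappa_1})$, we have, by substituting $u := G_\lambda f$, $G_\lambda f - G_\mu f = (\mu - \lambda)G_\mu G_\lambda f$,
which is the resolvent identity. Since $(\lambda - L)(D_{\kappa_1})$ is dense in $W_1C_{p,\kappa^*}$ for
$\lambda > \lambda'_{2,\kappa_1}$, we conclude that $G_\lambda$, $\lambda > \lambda'_{2,\kappa_1} \vee \lambda_{p,\kappa^*}$, is a pseudo-resolvent on
$W_1C_{p,\kappa^*}$, quasi-contractive in $WC_{p,\kappa^*}$.

Now we are left to prove that $G_\lambda$ is strongly continuous on $WC_{p,\kappa^*}$. Then the
last assertion will follow by the Hille-Yosida theorem. Let $f \in D$ and let $N \in \mathbb{N}$
be such that $f = f \circ P_N$. Then, for all $x \in X_p$, $m \ge N$, $\lambda \ge \lambda_* > \lambda'_{2,\kappa_1} \vee \lambda_{p,\kappa^*}$,

$$|\lambda G_\lambda f(x) - f(x)| \le |\lambda G_\lambda f - \lambda G_\lambda^{(m)} f|(x) + |\lambda G_\lambda^{(m)} f(P_N x) - f(P_N x)|.$$

As we have seen above, the first term in the right-hand side vanishes as
$m \to \infty$ uniformly in $\lambda \in [\lambda_*,\infty)$. The second term in the right-hand side
vanishes as $\lambda \to \infty$ for each $m \ge N$, by Corollary 4.2. Since $(\lambda - \lambda_{p,\kappa^*})G_\lambda$ is
quasi-contractive on $WC_{p,\kappa^*}$, it follows that $\lambda G_\lambda f \to f$ weakly in $WC_{p,\kappa^*}$ as $\lambda \to \infty$,
by Theorem 5.1. Hence, by Lemma 6.5, $G_\lambda$ is strongly continuous on the
closure of $D$ in $WC_{p,\kappa^*}$. However, by Corollary 5.3, this closure is the whole
space $WC_{p,\kappa^*}$. □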


REMARK 6.6. Since by (5.3) condition (F2d) holds with $p' \in [p,\infty) \cap Q_{\mathrm{reg}}$,
$\kappa' > \kappa$, if it holds with $p \in [2,\infty)$, $\kappa \in (0,\infty)$, the above theorem (and,
correspondingly, any of the results below) holds for any $\kappa^* \in (\kappa,\kappa_0)$ and with $p$ replaced
by any $p' \in [p,\infty) \cap Q_{\mathrm{reg}}$. We note that the corresponding resolvents, hence, also
the semigroups, are consistent when applied to functions in $D$. In particular, the
resolvents and semigroups of kernels constructed in the following proposition
coincide for any $\kappa^* \in (\kappa,\kappa_0)$ and $p' \in [p,\infty) \cap Q_{\mathrm{reg}}$.


Next we shall prove that both G, and P, in Theorem 6.4 above are given by 
kernels on Xp uniquely determined by L under a mild “growth condition.” 


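PROPOSITION 6.7 (Existence of kernels). Consider the situation of
Theorem 6.4, let $\lambda > \lambda_{p,\kappa^*} \vee \lambda'_{2,\kappa_1}$ and $t > 0$, and let $G_\lambda$ and $P_t$ be as constructed
there. Then:


(i) There exists a kernel $g_\lambda(x,dy)$ from $X_p$ to $H_0^1$ such that

$$g_\lambda f(x) := \int f(y)\,g_\lambda(x,dy) = G_\lambda f(x) \qquad \text{for all } f \in W_1C_{p,\kappa^*},\ x \in X_p,$$

which is extended by zero to a kernel from $X_p$ to $X_p$. Furthermore, $\lambda g_\lambda 1 = 1$, $g_\lambda$,
$\lambda > \lambda_{p,\kappa^*} \vee \lambda'_{2,\kappa_1}$, is a resolvent of kernels and

$$g_\lambda\Theta_{p,\kappa^*}(x) \le \frac{1}{m_{p,\kappa^*}}\,V_{p,\kappa^*}(x) \qquad \text{for all } x \in X_p,$$

with $m_{p,\kappa^*}$ as in (F2a).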


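(ii) There exists a kernel $p_t(x,dy)$ from $X_p$ to $X_p$ such that

$$p_t f(x) := \int f(y)\,p_t(x,dy) = P_t f(x) \qquad \text{for all } f \in WC_{p,\kappa^*},\ x \in X_p.$$

Furthermore, $p_t 1 = 1$ [i.e., $p_t(x,dy)$ is Markovian], $p_t$, $t > 0$, is a measurable
semigroup and

$$p_t V_{p,\kappa^*}(x) \le e^{\lambda_{p,\kappa^*}t}\,V_{p,\kappa^*}(x) \qquad \text{for all } x \in X_p.$$

(iii) We have

$$g_\lambda f(x) = \int_0^\infty e^{-\lambda\tau}\,p_\tau f(x)\,d\tau \tag{6.4}$$

for all $f \in \mathcal{B}_b(X_p) \cup \mathcal{B}^+(X_p)$, $x \in X_p$.

[We extend $g_\lambda$ for all $\lambda \in (0,\infty)$ using (6.4) as a definition.]

(iv) Let $x \in X_p$. Then

$$\int_0^\infty p_\tau(x, X_p \setminus H_0^1)\,d\tau = 0.$$

(v) For $x \in X_p$,

$$\int_0^t p_\tau\Theta_{p,\kappa^*}(x)\,d\tau < \infty,$$

so

$$\int_0^t p_\tau|f|(x)\,d\tau < \infty \qquad \text{for all } f \in W_1C_{p,\kappa^*}.$$

In particular, if $u \in D_{\kappa_1}$, then $\tau \mapsto p_\tau(|Lu|)(x)$ is in $L^1(0,t)$. Furthermore,

$$p_t u(x) - u(x) = \int_0^t p_\tau(Lu)(x)\,d\tau \qquad \text{for all } u \in D_{\kappa_1},\ x \in X_p. \tag{6.5}$$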


PROOF. (i) and (ii) are immediate consequences of Theorem 6.4, Corollary 5.2
and standard monotone class arguments. Equation (6.4) in (iii) holds by
Theorem 6.4 for $f \in WC_{p,\kappa^*}$. Hence, (iii) follows by a monotone class argument. Now
let us prove (iv). For all $f \in \mathcal{B}^+(X_p)$, by (iii), we have

$$\int_0^t p_\tau f(x)\,d\tau \le e^{\lambda t}\,g_\lambda f(x), \qquad x \in X_p. \tag{6.6}$$

Hence, (iv) follows with $f := 1_{X_p \setminus H_0^1}$ since $g_\lambda(x, X_p \setminus H_0^1) = 0$ for all $x \in X_p$. To
prove (v), we just apply (6.6) to $f := \Theta_{p,\kappa^*}$ and the first two parts of the assertion
follow by (i) and (iv). Now let $u \in D_{\kappa_1}$ ($\subset W_1C_{p,\kappa^*}$). Recall that, by Theorem 6.4,
$\lambda u - Lu \in W_1C_{p,\kappa^*}$, hence, $Lu \in W_1C_{p,\kappa^*}$, so

$$\int_0^t p_\tau(|Lu|)(x)\,d\tau < \infty \qquad \text{for all } x \in X_p.$$


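Finally, to prove (6.5), first note that, for $u \in D_{\kappa_1}$ ($\subset WC_{p,\kappa^*}$), we have
$G_\lambda u \in D(\overline{L})$, where $\overline{L}$ is the generator of $P_t$ on $WC_{p,\kappa^*}$, and

$$\overline{L}(G_\lambda u) = -u + \lambda G_\lambda u = G_\lambda(Lu), \tag{6.7}$$

since $G_\lambda$ is the left inverse of $(\lambda - L): D_{\kappa_1} \to W_1C_{p,\kappa^*}$. Therefore,

$$\int_0^t p_\tau(g_\lambda(Lu))\,d\tau = \int_0^t P_\tau\overline{L}G_\lambda u\,d\tau
 = P_t G_\lambda u - G_\lambda u = p_t(g_\lambda u) - g_\lambda u. \tag{6.8}$$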


But integrating by parts with respect to dt, we obtain, for all x € Xp, 


Í ‘peas 
= et [ e™ p (Lu)(x) dt —A [ e^ f e^ p,(Lu)(x) dt dr 
=e! EZE -[ ep (LuyG 4| 
t 


— À i pu |o (Lu)(x) — [. a pe (Lu)(x) 4| dr 
= e" gi(Lu)(x) — pi(ga (Lu))(x) 
t 
- (& — Dg.(Lu)() +4 Í pr(gy(Lu))(x) dr 


= p;u(x) — Api(gau)(x) — u(x) + Agu (x) 
+ Api(gxu)(x) — Agxu(x) 
= piu(x) — u(x), 


where in the second to last step we used (6.8) and that, by the second equality
in (6.7),

$$g_\lambda(Lu) = -u + \lambda g_\lambda u. \qquad □$$
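
Part (iv) of Proposition 6.7 can also be read off directly from the Laplace-transform relation in part (iii). The sketch below assumes that (6.4) has the form $g_\lambda f(x) = \int_0^\infty e^{-\lambda\tau} p_\tau f(x)\,d\tau$ and uses that, by part (i), $g_\lambda(x,\cdot)$ is carried by $H_0^1$.

```latex
% Take f := 1_{X_p \setminus H_0^1} (bounded and nonnegative) in the assumed form of (6.4):
0 \;=\; g_\lambda\bigl(x, X_p\setminus H_0^1\bigr)
  \;=\; \int_0^\infty e^{-\lambda\tau}\, p_\tau\bigl(x, X_p\setminus H_0^1\bigr)\,d\tau .
% The integrand is nonnegative and e^{-\lambda\tau} > 0, hence
p_\tau\bigl(x, X_p\setminus H_0^1\bigr) = 0 \quad\text{for Lebesgue-a.e. } \tau>0,
\qquad\text{that is,}\qquad
\int_0^\infty p_\tau\bigl(x, X_p\setminus H_0^1\bigr)\,d\tau = 0 .
% Consequence: for a function such as Lu, defined only on H_0^1, the integral
% \int_0^t p_\tau(Lu)(x)\,d\tau does not depend on how Lu is extended to X_p.
```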
Before we prove our uniqueness result, we need the following: 


LEMMA 6.8. Consider the situation of Theorem 6.4 and let $\lambda > \lambda'_{2,\kappa_1} \vee \lambda_{p,\kappa^*}$.
Then $(\lambda - L)(D)$ is dense in $W_1C_{p,\kappa^*}$.


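PROOF. Let $u \in D_{\kappa_1}$ and $N \in \mathbb{N}$ be such that $u = u \circ P_N$. Choose $\varphi \in C^\infty(\mathbb{R})$
such that $\varphi' \le 0$, $0 \le \varphi \le 1$, $\varphi = 1$ on $[0,1]$ and $\varphi = 0$ on $(2,\infty)$. For $n \in \mathbb{N}$, let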
2 
Qu(x) = pray, X € X, ug, :— nt. Then u, € D and 
Lu, = On, Lu + uLon + 2(Du, Ay Don). 


Note that, for i, j — 1,..., N, there are cj, cij € (0, oo) such that 


Ci 7 Cij 
(ð; Pnl < ~L] Pyxl<2n)» loj; Pnl S 42 LI Pwaro 2n] 


max c ; 


Then 0 < o, t 1 as n — oo, JAN Do,| € —,—, and [Lon(x)| € £(|x'lo + 
|PyFl2) x O p,k* (x) for all x € Hj and some c € (0, oo) independent of x and n 


by (F2c) and (F2d). So $u_n \to u$ and $Lu_n \to Lu$ pointwise on $H_0^1$ and bounded
in $W_1C_{p,\kappa^*}$. Hence, by Theorems 5.1 and 6.4, it follows that $(\lambda - L)(D)$ is weakly,
hence, strongly, dense in $W_1C_{p,\kappa^*}$. □





PROPOSITION 6.9. Consider the situation of Theorem 6.4 and let $(p_t)_{t>0}$ be
as in Proposition 6.7. Let $(q_t)_{t>0}$ be a semigroup of kernels from $X_p$ to $X_p$ such
that

$$\int_0^\infty e^{-\lambda\tau}\,q_\tau\Theta_{p,\kappa^*}(x)\,d\tau < \infty \qquad \text{for some } \lambda \in (0,\infty) \text{ and all } x \in X_p, \tag{6.9}$$

and

$$q_t u(x) - u(x) = \int_0^t q_\tau(Lu)(x)\,d\tau \qquad \text{for all } x \in X_p,\ u \in D. \tag{6.10}$$

[Note that the same arguments as in the proof of Proposition 6.7(iv) show that
$\int_0^\infty q_\tau(x, X_p \setminus H_0^1)\,d\tau = 0$, $x \in X_p$, hence, the right-hand side of (6.10) is
well-defined.] Then $q_t(x,dy) = p_t(x,dy)$ for all $x \in X_p$, $t > 0$.


PROOF. Let $u \in D$, $x \in X_p$, $t > 0$, and $\lambda$ as in (6.9). Integrating by parts with
respect to $d\tau$ and then using (6.10), we obtain


[ «aano cod: 
= [ ie" ra qc(Lu)(x) dt ds + gu [ gr(Lu)(x) dt 


= Í Ae" gs (u)(x) ds — J i Ae "u(x)ds + e™ (q: (Qu) (x) — u(x)), 


SO, 


J equa — hnas e Na: 


Since (6.9) Holds also with À' > À instead of A, we can let à 7 oo to obtain that the 
resolvent gts = =h e 759. ds, À > 0, of (qr):»0 is the left inverse of (A — L)| 5. 
Hence, g, and gy Sonne on (A — L)D which is dense in W)C p,,+. But by (6.9) 
and Theorem 5.1, gf (x, dy) € (W1C p, *)' [and so is gi (x, dy)] for all x € X p. 
Hence, = = ga. Since t +> qru (x) by (6.10) is continuous for all u € D, x € Xp, 




the assertion follows by the uniqueness of the Laplace transform and a monotone 
class argument. C 


Another consequence of Lemma 6.8 is the following characterization of the 
generator domain of the Co-semigroup P, on WC, +. The second part of the fol- 
lowing corollary will be crucial to prove the weak sample path continuity of the 
corresponding Markov process in the next section. 


COROLLARY 6.10. Consider the situation of Theorem 6.4. Let $\overline{L}$ denote the
generator of $P_t$ as a $C_0$-semigroup on $WC_{p,\kappa^*}$.


(i) Then $v \in WC_{p,\kappa^*}$ belongs to $\mathrm{Dom}(\overline{L})$ if and only if there exist $f \in WC_{p,\kappa^*}$
and $(u_n) \subset D$ such that $u_n \to v$ and $Lu_n \to f$ strongly, equivalently, weakly,
in $W_1C_{p,\kappa^*}$ as $n \to \infty$, that is, $u_n \to v$ and $Lu_n \to f$ pointwise on $H_0^1$, and
$\sup_n(\|u_n\|_{1,p,\kappa^*} + \|Lu_n\|_{1,p,\kappa^*}) < \infty$. In this case, $\overline{L}v = f$ and $u_n \to v$ weakly
in $WC_{p,\kappa^*}$ as $n \to \infty$.

(ii) If $v \in \mathrm{Dom}(\overline{L})$ and $v$, $\overline{L}v$ are bounded, then the sequence $(u_n) \subset D$
from (i) can be chosen uniformly bounded.

(iii) Let $\lambda > \lambda_{p,\kappa^*} \vee \lambda'_{2,\kappa_1}$ and $v \in D(\overline{L})$ such that $v$, $\overline{L}v$ are bounded, and let


x € Xy. Then there exists a Borel-measurable map D*, aV: Xp — X such that, 
for any sequence (un) C Dy, such that un — v, Lus > Lv weakly in W1C p,k* as 
n — oo with sup, llunlloo < 00, we have 


im a e. (DA — AV? Du, l2) (x) =. 
Furthermore, for all x € cC? (IR) and t > 0, 
pi(x o v)(x) — (x o v)(x) 
= f ' p:(x' ovLv)(x) dt + I pix" o v(DA ipv, DA15v))(x) dr. 


If, in particular, v = g, f for some f € D, then, in addition, for all x’ € (0, ki], 


] 
ID% i200) < = Lor a for gi (x, dy)-a.e. y € X y. 


PROOF. (i) Note that v e Dom(Z) if and only if v = Gg for some g € WC p. 
A> Apne V Ag - . Given such v, by Lemma 6.8, there exist u, € D, n € N, such 
that (A — L)u, > g in W1Cp,c* as n — oo. Then un = Ga (À — L)us, > Gag =v 
in WCp,«* by Theorem 6.4, consequently, 


Lun > àv — g =: f €WCpxs, 


as n — oo in WC, *, hence, by Corollary 5.6 in W1Cp,«*. On the other hand, let 
v, f € WC, x be such that, for some (un) C D, Un — v and Lu, — f weakly 




in Wi C p,k*- Then, for À > A5 6 V À p,k*: 
v= lim up = lim G4 (4 — Ljun = Galv — f), 


weakly in WC) x*, since, by Theorem 6.4, the latter equality holds as a weak limit 
in WC,» (hence, as a weak limit in W1 Cp,«x* by Corollary 5.6). 

(ii) By assumption, g :— Av — Lv is bounded. By Corollary 5.3, there exist 
gn € D, n € N, which we can choose such that sup, |gn| < llglloo, converging 
tog in WC) +. Let X > Apes V À5 a and consider vy :— GO gn, m € N. Then 
Un,m € Dy, by Corollary 4.2, and by Theorem 6.4, 


(6.11) um Un.m = GA gn weakly in WCp,«*, hence, weakly in W1C , x, 


and 

(6.12) „im (A — L)Un,m = 8n weakly in Wi Cp xe. 
Therefore, 

(6.13) „lim Lu, = —8n +4Gr8n > —8 + AGB = Lv 


weakly in W1C p, *, as n — oo. Since Agen is Markov, Vn,m, n, m c N, is uni- 
formly bounded. Consequently, the pair a. Lv) lies in the weak closure of the 
convex set 


(6.14) {(u, Lu)|u € Da, lallo < IIB loo} 


in W1C y, «* x W1Cy,«*, hence, also in its strong closure. Repeating the same argu- 
ments as in Lemma 6.8, it follows that, in (6.14), D,, can be replaced by D and 
assertion (ii) follows. 

(iii) If (un) C D is a sequence as in the assertion, then, since (un — Um)? € D, 


(A — L)(Un — Um)? + 21A ^ Dun — Um) |? = 2(un — um). — L)(Un — un). 
Hence, applying 23 (x, dy), we obtain 
(Un — Um)? (x) + 281 (IA? D (un — Um) P) (x) 
= 2g) ((Un — um)( — L)(us — Um)) (x). 


Hence, the first assertion follows by Theorem 6.4 and Proposition 6.7(1) by 
Lebesgue's dominated convergence theorem. Furthermore, 


t 
| pr(x, dy) dx < e^ gi (x, dy), 


k X (un) € D, and by (6.5), 


pi(X 0 un)(x) — (x o un)(x) 


Í H 
= Í px ousLus)(x) dt + Í pc" o u,|AV^ Du, [) (x) dt. 




Hence, the second assertion again follows by dominated convergence, since 
Un > u weakly in WC,» as n — oo by the last assertion of (1). To prove the 
final part of (ii), define 

Un — GU f. nc N. 


Then by Theorem 6.4, (un) has all properties above so that (AV? Duy) ap- 
proximates D*,,,v in the above sense. But by (4.4), with q :— p, k :— «', and 
Lemma 3.6, 


1 
|Dun|(y) < xcx 02 VO) for all y € X (5 Xp). 
lia 


AV 


L1 


Next we want further regularity properties. We emphasize that these results will 
not be used in the next section. We extend both gi (x, dy), p;(x, dy) by zero to 
kernels from X, to X. 


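PROPOSITION 6.11. Consider the situation of Theorem 6.4 and let $g_\lambda$, $p_t$
be as in Proposition 6.7. Let $q \in Q_{\mathrm{reg}} \cap [2,p]$ and $\kappa \in [\kappa_1,\kappa^*]$ with $\lambda_{q,\kappa}$
and $\lambda'_{q,\kappa}$ as in Corollary 4.2. Let $\lambda > \lambda_{q,\kappa} \vee \lambda_{p,\kappa^*} \vee \lambda'_{2,\kappa_1}$.


(i) Let $f \in WC_{q,\kappa}$. Then $g_\lambda f$ uniquely extends to a continuous function on $X_q$,
again denoted by $g_\lambda f$, such that

$$\|g_\lambda f\|_{q,\kappa} \le \frac{1}{\lambda - \lambda_{q,\kappa}}\,\|f\|_{q,\kappa}. \tag{6.15}$$

If $f \in \mathrm{Lip}_{0,2,\kappa_1} \cap \mathcal{B}_b(X)$, then $g_\lambda f$ extends uniquely to a continuous function
on $X$, again denoted by $g_\lambda f$, such that $g_\lambda f \in \mathrm{Lip}_{0,2,\kappa_1} \cap \mathcal{B}_b(X)$ and, for $\lambda > \lambda'_{q,\kappa}$,
satisfying (6.15) and

$$(g_\lambda f)_{0,q,\kappa} \le \frac{1}{\lambda - \lambda'_{q,\kappa}}\,(f)_{0,q,\kappa}. \tag{6.16}$$

If, in addition, (F2e) holds, then, for $\lambda > \lambda''_{q,\kappa}$ and $f \in \mathrm{Lip}_{1,2,\kappa_1} \cap \mathcal{B}_b(X)$,

$$(g_\lambda f)_{1,q,\kappa} \le \frac{1}{\lambda - \lambda''_{q,\kappa}}\,(f)_{1,q,\kappa}. \tag{6.17}$$

(ii) Let $t > 0$ and $f \in \mathrm{Lip}_{0,2,\kappa_1} \cap \mathcal{B}_b(X) \cap WC_{p,\kappa^*}$ ($\supset D$). Then $p_t f$ uniquely
extends to a continuous function on $X$, again denoted by $p_t f$, which is in
$\mathrm{Lip}_{0,2,\kappa_1} \cap \mathcal{B}_b(X)$, such that

$$\|p_t f\|_{q,\kappa} \le e^{\lambda_{q,\kappa}t}\,\|f\|_{q,\kappa}; \tag{6.18}$$

$$(p_t f)_{0,q,\kappa} \le e^{\lambda'_{q,\kappa}t}\,(f)_{0,q,\kappa}. \tag{6.19}$$

If, in addition, (F2e) holds, then, for $f \in \mathrm{Lip}_{1,2,\kappa_1} \cap \mathcal{B}_b(X)$,

$$(p_t f)_{1,q,\kappa} \le e^{\lambda''_{q,\kappa}t}\,(f)_{1,q,\kappa}. \tag{6.20}$$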



REMARK 6.12. (i) Because of Remark 6.6, the restrictions $q \le p$ and
$\kappa \in [\kappa_1,\kappa^*]$ in the above proposition are irrelevant since, for given $q \in Q_{\mathrm{reg}}$ and
$\kappa \in (0,\kappa_0)$, we can always choose $p$, $\kappa_1$, $\kappa^*$ suitably.

(ii) If (F2e) holds, by similar techniques as in the following proof of
Proposition 6.11 and by the last part of Proposition 5.8, one can prove that $p_t$ from
Proposition 6.7 can be extended to a semigroup of kernels from $X$ to $X$ such that

$$\lim_{t\to 0} p_t u(x) = u(x) \qquad \text{for all } u \in D,\ x \in X.$$

Then the proof of the first part of Theorem 7.1 in the next section implies the
existence of a corresponding cadlag Markov process on $X$. However, we do not
know whether this process solves our desired martingale problem, since it is not
clear whether identity (6.5) holds for the above extended semigroup for all $x \in X$.
As is well known and will become clear in the proof of Theorem 7.1 below, (6.5) is
crucial for the martingale problem.

(iii) We emphasize that, in Proposition 6.11, it is not claimed that the
extensions of $g_\lambda$ and $p_t$ satisfy the resolvent equation or have the semigroup property,
respectively, on the larger spaces $X_q$ or $X$. It is also far from being clear whether
$\lim_{t\to 0} p_t u(x) = u(x)$ for $u \in D$ and all $x \in X$. Furthermore, it is also not clear
whether $g_\lambda f \in WC_{q,\kappa}$ if $f \in WC_{q,\kappa}$.


PROOF OF PROPOSITION 6.11. (i) Let f € Lipo? ,, N Bp(X). Hence, by 
(4.3) and (4.4) [together with (2.8)] applied with q = 2, x = «1, it follows 
by Proposition 5.8 that (G (m) f)mew has subsequences converging to functions 


in Lipo?,. Since we know by Theorem 6.4 that (iG f)men converges to the 
continuous function Gaf [= 2; f by Proposition 6.7(1)] on X, and since X, is 
dense in X, we conclude that all these limits must coincide. Hence, gj f has a 
continuous extension in Lip» ,,, which we denote by the same symbol. Since 
Py — idx, strongly on X, as N > oo, by (4.3), (4.4) and Lemma 5.7, we ob- 
tain (6.15) and, provided A > Àg, V Ag (6.16) for such f, since Lipo» ,, C 
Lipo , ,,- If, in addition, (F2e) holds, (4.7) and Lemma 5.7 imply (6.17), provided 
f € Lip; 2,4, 1 85(X) and à > LP . Considering (6.15) for f € D, since D is 
dense in WC, x, (6.15) extends to all of WCq, « by continuity. For f € WC,,,, the 
resulting function, lets call it g; f on Xq, is equal to g; f on Xp, since by The- 
orem 5.1, for u, € D, n EN, with un — f as n — oo in WC, it follows that 
Zaun (x) — 8a f (x) as n — oo for all x € Xp. So, g, f coincides with gif on Xp 
and g} f is the desired extension. Since X, is dense in X, C X continuously, it 
follows that, for f € WC; & N Lipo 2 4, O Bo (X), the two Constructed extensions 
of ga f coincide on X, by continuity. So, (i) is completely proved. 

(ii) First, we recall that, by Theorem 6.4 and Proposition 6.7(ii), since
$f \in WC_{p,\kappa^*}$, $p_t f = P_t f$ and

$$p_t f = \lim_{n\to\infty}\Bigl(\frac{n}{t}\,g_{n/t}\Bigr)^n f \qquad \text{in } WC_{p,\kappa^*}, \tag{6.21}$$

in particular, pointwise on $X_p$. But by (i), for (large enough) $n \in \mathbb{N}$, $(\frac{n}{t}g_{n/t})^n f$ have
continuous extensions which belong to $\mathrm{Lip}_{0,2,\kappa_1} \cap \mathcal{B}_b(X)$ and satisfy (6.15), (6.16)
and, provided (F2e) holds, also (6.17) with $\lambda$ replaced by $\frac{n}{t}$. So, by Proposition 5.8,
Lemma 5.7 and the same arguments as in the proof of (i), the assertion follows,
since by Euler's formula, for $\lambda_0 > 0$,

$$\lim_{n\to\infty}\Bigl(\frac{n/t}{n/t - \lambda_0}\Bigr)^n = e^{\lambda_0 t}. \qquad □$$
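
The limit attributed to Euler's formula at the end of this proof, and the way it turns an iterated resolvent bound into a semigroup bound, can be spelled out as follows. This is a sketch: it assumes a resolvent estimate of the form $\|g_\lambda f\|_{q,\kappa} \le (\lambda - \lambda_{q,\kappa})^{-1}\|f\|_{q,\kappa}$, as in (6.15), and that the estimate can be iterated $n$ times.

```latex
% For fixed t > 0, \lambda_0 > 0 and n > \lambda_0 t:
\Bigl(\frac{n/t}{\,n/t-\lambda_0\,}\Bigr)^{n}
  = \Bigl(1-\frac{\lambda_0 t}{n}\Bigr)^{-n}
  \;\xrightarrow[n\to\infty]{}\; \bigl(e^{-\lambda_0 t}\bigr)^{-1} = e^{\lambda_0 t}.
% Iterating the assumed resolvent bound n times with \lambda = n/t:
\Bigl\|\Bigl(\tfrac{n}{t}\,g_{n/t}\Bigr)^{n} f\Bigr\|_{q,\kappa}
  \le \Bigl(\frac{n/t}{\,n/t-\lambda_{q,\kappa}\,}\Bigr)^{n}\,\|f\|_{q,\kappa}
  \;\longrightarrow\; e^{\lambda_{q,\kappa}t}\,\|f\|_{q,\kappa}
  \qquad (n\to\infty),
% which, combined with the pointwise convergence in (6.21) and Lemma 5.7,
% has the shape of the semigroup bound (6.18).
```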


7. Solution of the martingale problem and of SPDE (1.1). This section is 
devoted to the proof of the following theorem which is more general than Theo- 
rem 2.3. 


THEOREM 7.1. Assume that (A), (F2) hold and let $\kappa_0$ be as in (F2a),
$\kappa \in (0,\kappa_0)$ and $p \in Q_{\mathrm{reg}}$ as in (F2d). Let $\kappa^* \in (\kappa,\kappa_0)$, $\kappa_1 \in (0,\kappa^*-\kappa]$ and let
$\lambda_{p,\kappa^*}$ be as in Corollary 4.2 (with $\kappa^*$ replacing $\kappa$ there). Let $(p_t)_{t>0}$ be as in
Proposition 6.7(ii).


(i) There exists a conservative strong Markov process $\mathbb{M} := (\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge 0},
(x_t)_{t\ge 0}, (\mathbb{P}_x)_{x\in X_p})$ on $X_p$ with continuous sample paths in the weak topology
whose transition semigroup is given by $(p_t)_{t>0}$, that is, $\mathbb{E}_x f(x_t) = p_t f(x)$,
$x \in X_p$, $t > 0$, for all $f \in \mathcal{B}_b(X_p)$, where $\mathbb{E}_x$ denotes expectation with respect
to $\mathbb{P}_x$. In particular,

$$\mathbb{E}_x\Bigl[\int_0^\infty e^{-\lambda_{p,\kappa^*}s}\,\Theta_{p,\kappa^*}(x_s)\,ds\Bigr] < \infty \qquad \text{for all } x \in X_p.$$

(ii) ("Existence") The assertion of Theorem 2.3(ii) holds for $\mathbb{M}$.

(iii) ("Uniqueness") The assertion of Theorem 2.3(iii) holds with $\kappa$, $\lambda_\kappa$ replaced
by $\kappa^*$, $\lambda_{p,\kappa^*}$ respectively.

(iv) If there exist $p' \in [p,\infty)$, $\kappa' \in [\kappa^*,\kappa_0)$ such that


(7.1) sup OF O)(F(), n)! «oo — ferallmeN, 
ye 


then $\mathbb{M}$ from assertion (i) weakly solves SPDE (1.1) for $x \in X_p$ as initial
condition.


REMARK 7.2. (i) Due to Theorem 7.1(iv), it suffices to show that (F1) im- 
plies (7.1) to prove Theorem 2.3(iv). It follows from (F1) that, for all m € N and 
y€ Hj, 

(EO), nm)| € I np)! + QE Q2. m,)| + EO), mm)! 
< A2n^m^ (ly + |V O)h *- I8 (211). 




Proceeding exactly as in the proof of Lemma 4.9, we find that, for all p’ € [2, oo), 
x’ € (0, oo) up to a constant (which is independent of y) which is dominated by 


2229 f 2(p/4-2 
ent a) +2) (y) + ee +2)) (y). 


Here we also used Remark 2.1(i). Note that (2(p’ + 2) ! x 4 and (q2 — 2 + 
=) IP +2) < 5 if and only if p’ > 2qg2 — 6 + 2 Hence, in the latter case, 
(7.1) holds and, therefore, Ml weakly solves (1.1), by Theorem 7.1(iv). 

(ii) Since by Remark 6.6 we can always increase $p$ as long as it is in $Q_{\mathrm{reg}}$, which
is equal to $[2,\infty)$ if (F1) holds, Theorem 7.1, in particular, implies that, for $\tilde p > p$,
$\tilde p \in Q_{\mathrm{reg}}$, $X_{\tilde p}$ is an invariant subset for the process $\mathbb{M}$ and the sample paths are
even weakly continuous in $X_{\tilde p}$.


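PROOF OF THEOREM 7.1. (i) and (ii): We mostly follow the lines of the proof
of [7], Theorem 1.9.4.
Let $\Omega_0 := X_p^{[0,\infty)}$ equipped with the product Borel $\sigma$-algebra $\mathcal{M}^0$, $x_t^0(\omega) := \omega(t)$
for $t \ge 0$, $\omega \in \Omega_0$ and, for $t \ge 0$, let $\mathcal{M}_t^0$ be the $\sigma$-algebra generated by the
functions $x_s^0$, $0 \le s \le t$. By Kolmogorov's theorem, for each $x \in X_p$, there exists a
probability measure $\mathbb{P}_x$ on $(\Omega_0, \mathcal{M}^0)$ such that $\mathbb{M}^0 := (\Omega_0, \mathcal{M}^0, (\mathcal{M}_t^0)_{t\ge 0}, (x_t^0)_{t\ge 0},
(\mathbb{P}_x)_{x\in X_p})$ is a conservative time homogeneous Markov process with $\mathbb{P}_x\{x_0^0 =
x\} = 1$ and $p_t$ as (probability) transition semigroup.

Now we show that, for all $x \in X_p$, the trajectory $x_t^0$ is locally bounded $\mathbb{P}_x$-a.s.,
that is,

$$\mathbb{P}_x\Bigl\{\sup_{t\in[0,T]\cap\mathbb{Q}}|x_t^0|_p < \infty\ \ \forall T > 0\Bigr\} = 1 \qquad \forall x \in X_p. \tag{7.2}$$

Let $g := V_{p,\kappa^*}$. Then by Proposition 6.7(iii),

$$e^{-\lambda_{p,\kappa^*}t}\,p_t g(x) \le g(x) \qquad \text{for all } x \in X_p,\ t > 0. \tag{7.3}$$

Hence, for all $x \in X_p$, the family $e^{-\lambda_{p,\kappa^*}t}g(x_t^0)$ is a super-martingale over
$(\Omega_0, \mathcal{M}^0, \mathcal{M}_t^0, \mathbb{P}_x)$ since, given $0 \le s < t$ and $Q \in \mathcal{M}_s^0$, by the Markov property,

$$\begin{aligned}
\mathbb{E}_x\bigl\{e^{-\lambda_{p,\kappa^*}t}g(x_t^0),\,Q\bigr\}
 &= \mathbb{E}_x\bigl\{e^{-\lambda_{p,\kappa^*}s}\,e^{-\lambda_{p,\kappa^*}(t-s)}\,p_{t-s}g(x_s^0),\,Q\bigr\} \\
 &\le \mathbb{E}_x\bigl\{e^{-\lambda_{p,\kappa^*}s}g(x_s^0),\,Q\bigr\}.
\end{aligned}$$

Then, by [7], Theorem 0.1.5(b)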
: 0 : 0 
P, [3 dim i?i and dim ixl vr 0| =1 Vx € Xp. 


In particular, (7.2) holds. 
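
The super-martingale property used above can be restated in conditional form. The lines below are a sketch; $\lambda$ abbreviates $\lambda_{p,\kappa^*}$, $g = V_{p,\kappa^*}$, and the only inputs are the Markov property of the canonical process and inequality (7.3).

```latex
% \lambda := \lambda_{p,\kappa^*}, \quad 0 \le s < t.
\mathbb{E}_x\bigl[e^{-\lambda t} g(x_t^0)\,\big|\,\mathcal{M}_s^0\bigr]
  = e^{-\lambda t}\,(p_{t-s}g)(x_s^0)                                   % Markov property
  = e^{-\lambda s}\,\bigl[e^{-\lambda (t-s)}(p_{t-s}g)(x_s^0)\bigr]
  \le e^{-\lambda s}\, g(x_s^0).                                         % by (7.3) at the point x_s^0
% Integrability of each term follows from (7.3) as well:
\mathbb{E}_x\bigl[e^{-\lambda t} g(x_t^0)\bigr] = e^{-\lambda t}\,p_t g(x) \le g(x) < \infty
  \qquad (x \in X_p).
```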
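Now we show that $x_t^0$ can be modified to become weakly cadlag on $X_p$, that is,
that

$$\mathbb{P}_x\Bigl\{\exists\ \text{w-}\!\!\lim_{\mathbb{Q}\ni s\downarrow t}x_s^0 \ \text{and}\ \text{w-}\!\!\lim_{\mathbb{Q}\ni s\uparrow t}x_s^0\ \ \forall t \ge 0\Bigr\} = 1 \qquad \forall x \in X_p. \tag{7.4}$$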




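For a positive $f \in D$ and $\lambda > 0$, we have $e^{-\lambda t}p_t g_\lambda f \le g_\lambda f$ for all $x \in X_p$ and
$t > 0$. Hence, by the preceding argument, the family $e^{-\lambda t}g_\lambda f(x_t^0)$ is a
super-martingale over $(\Omega_0, \mathcal{M}^0, \mathcal{M}_t^0, \mathbb{P}_x)$ for all $x \in X_p$ and

$$\mathbb{P}_x\Bigl\{\exists \lim_{\mathbb{Q}\ni s\downarrow t} g_\lambda f(x_s^0) \ \text{and}\ \lim_{\mathbb{Q}\ni s\uparrow t} g_\lambda f(x_s^0)\ \ \forall t \ge 0\Bigr\} = 1 \qquad \forall x \in X_p.$$

By Proposition 6.7(i) and Theorem 6.4, we know that $\lambda g_\lambda f \to f$ as $\lambda \to \infty$
uniformly on balls in $X_p$. Since $(x_t^0)_{t\in\mathbb{Q}}$ is locally bounded in $X_p$ $\mathbb{P}_x$-a.s. for all
$x \in X_p$, we conclude that

$$\mathbb{P}_x\Bigl\{\exists \lim_{\mathbb{Q}\ni s\downarrow t} f(x_s^0) \ \text{and}\ \lim_{\mathbb{Q}\ni s\uparrow t} f(x_s^0)\ \ \forall t \ge 0\Bigr\} = 1 \qquad \forall x \in X_p.$$

Now let $f$ run through the countable set

$$\widetilde{D} := \{\cos(\eta_k, \cdot) + 1,\ \sin(\eta_k, \cdot) + 1 \mid k \in \mathbb{N}\} \subset D, \tag{7.5}$$

which separates the points in $X_p$. Then we get

$$\mathbb{P}_x\Bigl\{\exists \lim_{\mathbb{Q}\ni s\downarrow t} f(x_s^0) \ \text{and}\ \lim_{\mathbb{Q}\ni s\uparrow t} f(x_s^0)\ \ \forall t \ge 0,\ \forall f \in \widetilde{D}\Bigr\} = 1 \qquad \forall x \in X_p.$$

Now (7.4) follows from the fact that $(x_t^0)_{t\in\mathbb{Q}}$ is locally in $t$ weakly relatively
compact in $X_p$ $\mathbb{P}_x$-a.s. for all $x \in X_p$.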
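Let now

$$\Omega' := \Bigl\{\exists\ \text{w-}\!\!\lim_{\mathbb{Q}\ni s\downarrow t}x_s^0 \ \text{and}\ \text{w-}\!\!\lim_{\mathbb{Q}\ni s\uparrow t}x_s^0\ \ \forall t \ge 0\Bigr\},$$

$$\mathcal{M}' := \{Q \cap \Omega' \mid Q \in \mathcal{M}^0\},$$

$$\mathcal{M}'_t := \{Q \cap \Omega' \mid Q \in \mathcal{M}_t^0\}, \qquad t \ge 0,$$

$$x_t := \text{w-}\!\!\lim_{\mathbb{Q}\ni s\downarrow t}x_s^0, \qquad t \ge 0.$$

Then for all $x \in X_p$ and $f \in D$, $t \ge 0$,

$$\begin{aligned}
\mathbb{E}_x\bigl[(f(x_t) - f(x_t^0))^2\bigr] &= \lim_{\mathbb{Q}\ni s\downarrow t}\mathbb{E}_x\bigl[(f(x_s^0) - f(x_t^0))^2\bigr] \\
 &= \lim_{\mathbb{Q}\ni s\downarrow t}\bigl(p_s f^2(x) - 2p_t(f\,p_{s-t}f)(x) + p_t f^2(x)\bigr) = 0,
\end{aligned}$$

since by (6.5), $t \mapsto p_t f(x)$ is continuous. Hence, $\mathbb{P}_x[x_t^0 = x_t] = 1$. Therefore,
$\mathbb{M}' := (\Omega', \mathcal{M}', (\mathcal{M}'_t)_{t\ge 0}, (x_t)_{t\ge 0}, (\mathbb{P}_x)_{x\in X_p})$ is a weakly cadlag Markov process
with $\mathbb{P}_x\{x_0 = x\} = 1$ and $p_t$ as transition semigroup.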




Below, $\mathcal{F}$, $\mathcal{F}_t$ shall denote the usual completions of $\mathcal{M}'$, $\mathcal{M}'_{t+}$. Then it
follows from [7], Theorem 1.8.11 and Proposition 1.8.11, that $\mathbb{M} := (\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge 0},
(x_t)_{t\ge 0}, (\mathbb{P}_x)_{x\in X_p})$ is a strong Markov cadlag process with $\mathbb{P}_x\{x_0 = x\} = 1$ and $p_t$
as transition semigroup.

To prove that $\mathbb{M}$ even has weakly continuous sample paths, we first need to
show that it solves the martingale problem. So, fix $x \in X_p$ and $u \in D_{\kappa_1}$. It follows
by Proposition 6.7(v), that for all $t \ge 0$,

$$|Lu|(x_\cdot) \in L^1(\Omega \times [0,t],\ \mathbb{P}_x \otimes ds). \tag{7.6}$$

Furthermore, by (6.5) and the Markov property, it then follows in the standard way
that, under $\mathbb{P}_x$,

$$u(x_t) - u(x_0) - \int_0^t Lu(x_s)\,ds, \qquad t \ge 0, \tag{7.7}$$

is an $(\mathcal{F}_t)_{t\ge 0}$-martingale starting at 0.
Now we show weak continuity of the sample paths. Fix x € X, and f € D. 


Let À > Apt V Ahe U :— gif. (€ D(L) C Wp) and u € Lipps. for all 
x’ € (0, oo). Then u and Lu are bounded, and trivially, 


$[u(x_t) - u(x_s)]^4 = [u^4(x_t) - u^4(x_s)] - 4[u^3(x_t) - u^3(x_s)]\,u(x_s)$
$\qquad + 6[u^2(x_t) - u^2(x_s)]\,u^2(x_s) - 4[u(x_t) - u(x_s)]\,u^3(x_s).$
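
The expansion above is the binomial identity for $(a - b)^4$ with $a = u(x_t)$ and $b = u(x_s)$, regrouped so that only increments of powers of $u$ appear. As a quick sanity check, the following sketch verifies it exactly on integer samples. The function name is illustrative and not part of the paper.

```python
import random


def quartic_increment_identity(a, b):
    """Return (lhs, rhs) of the expansion, with a = u(x_t) and b = u(x_s)."""
    lhs = (a - b) ** 4
    rhs = ((a ** 4 - b ** 4)
           - 4 * (a ** 3 - b ** 3) * b
           + 6 * (a ** 2 - b ** 2) * b ** 2
           - 4 * (a - b) * b ** 3)
    return lhs, rhs


# Integer arithmetic is exact, so equality can be tested without tolerance.
random.seed(0)
for _ in range(1000):
    a = random.randint(-50, 50)
    b = random.randint(-50, 50)
    lhs, rhs = quartic_increment_identity(a, b)
    assert lhs == rhs
```

Each bracket on the right-hand side is an increment of a power of $u$, which is what allows the martingales listed next to be used term by term.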


Since the martingale property is stable under L!(P,)-limits, by (7.7) and (the 
proof of) Corollary 6.10(iii), the following processes are right continuous mar- 
tingales under P, : 


t 
u(x;) — u (xo) — I Lu(x,) dt, 
t E x 
iG) wo) | Cuna) + [Dial Gs) dr, 
t S - 
iG) - Wao) | G Lus) + Gu Dal (ar) d, 


f _ m 
i^) — uo) — | Gu? LuyGs) + (DA ou re) dr, 
t > 0. Hence, we obtain, for t > s, 
E, [u (xi) — u(x;)]* 


=4E, | Tee uias e P die 


A" 
+ 6E, / ID* pul? Go us) — ux) de 


. t 3/4 
<Al|Lulloo(t — s)!” (s. [wen esr dr) 
AY 


= | t 2/3 
6e" P (s, (D, uf 02) ^ (Es | lu) - u(x)? dr] | 


But by Corollary 6.10(iii) with x’ = 1/6, we have, for all y € Xp, 


" 6 
ga (D^ inul) = ( (490,24, /48%( Ver) O), 


1 
A — Me " 
and by the last part of Proposition 6.7(ii), 
£x (Vea) Œ) < gx (Vp) (x) 
S (A — Ape) Vpr (x). 


Therefore, for T € [1, 00), we can find a constant C > 0 independent of s,t € 
[0, T], £ > s, such that 


E,[u(xr) — u(x,)]* 
(7.8) 


1/4 
< cl — 5^ (x. J uen- ax) pas 35 yo, 


where, for s > 0 fixed, we set 


t 1/2 
(7.9) y(t) := (J E,[u(xr) — u (x,)]^ dt) i t € [s, T ]. 


Hence, we obtain from (7.8) that, for Br :— CT /^, 
y (t) xiBry/"()-$C(t—s), — te[s T] 
y(s) — 0. 

Hence, for ¢ > 0, t € (s, T], 


f E l 2C 1/6 
L m a (pes 
yO < Sy) + BR + (t — 9", 


so, multiplying by exp(— 7 (t — s)) and integrating, we obtain 
l 2 2C 7/6 | ,e(t—s)/4 
y(t) < (55 + d — $) y 8/4. 


Choosing € :— 4(t — s)~!, we arrive at 

y(t) < (B2T?$ +20) — s), — re[s, T]. 
Substituting according to (7.9) into (7.8), by the Kolmogorov—Chentsov criterion, 
we conclude that t F» u(x) is continuous (since by construction X; .— limgss4: x2). 
Now we take u € D1 := U,enngn(D) [cf. (7.5)]. Since D separates the points 


of X p, so does Dı. It follows that the weakly cadlag path t — x; is, in fact, weakly 
continuous in X p. 




(iii) Uniqueness is now an immediate consequence of Proposition 6.9. 

(iv) As in [2], Theorem 1, one derives that componentwise $(x_t)_{t\ge 0}$ under $P_x$
weakly solves the stochastic equation (1.1) for all starting points $x \in X_p$. This
follows from Levy’s characterization theorem (since (ng, -) € De, Vk € N) and by 
the fact that the quadratic variation of the weakly continuous martingale in (7.7) is 
equal to 


(7.10) [anu Du) (xs) ds, t>0. 


The latter can be shown by a somewhat lengthy calculation, but it is well known in
finite-dimensional situations, at least if the coefficients are bounded. For the convenience
of the reader, we include a proof in our infinite-dimensional case in the Appendix
(cf. Lemma A.1). Hence, assertion (iv) is completely proved. □


REMARK 7.3. In Theorem 7.1(iv) SPDE (1.1) is solved in the sense of Theorem 5.7
in [2], which means componentwise. To solve it in $X_p$, one needs, of
course, to make assumptions on the decay of the eigenvalues of $A$ to have that
$(\sqrt{A}\,w_t)_{t\ge 0}$ takes values in $X_p$. If this is the case, by the same method as in [2],
one obtains a solution to the integrated version of (1.1) where the equality holds
in $X_p$ (cf. [2], Theorem 6.6).

APPENDIX 


LEMMA A.1. Consider the situation of Theorem 7.1(iv) and let $u \in D$. Assume,
without loss of generality, that $p' = p$, $\kappa = \kappa^*$. Let $x \in X_p$, and define, for
$t \ge 0$,

$M_t := \Big(u(x_t) - u(x_0) - \int_0^t Lu(x_r)\,dr\Big)^2 - 2\int_0^t \Gamma(u)(x_r)\,dr,$

where $\Gamma(u) := \langle A\,Du, Du\rangle$. Then $(M_t)_{t\ge 0}$ is an $(\mathcal{F}_t)_{t\ge 0}$-martingale under $P_x$.


PROOF. Let s € [0, 7). We note that, by (7.1), (Mi)i»o is a P,-square inte- 
grable martingale, so all integrals below are well defined. We have 


M,—M; 


t $ 
— (veo — u(xo) — Í Lu(xr)dr + u(xs) — u (xo) — Í Lu(x,) dr) 
f Hi 
x (wx — u(Xs) -f Lu(x,) dr) -2f r(u)(xr)dr 
5 t 
= (wx + u(xs) — 2u(xo) — 2 | Lu(x,) dr — f Lu(x,) dr) 


x (we) —u(x;) — [ Lu(x,) dr) — f DU (u)(x,)dr 


= u^ (xy) — u^ (xs) — 2u(xo)(u(x;) — u(xs)) 


S 1—35 
-2(u6) — uCas)) | Lus) dr — (w(x) -u6)) [Lu rs) 


t—s —S8 
- (ve) ku) | Lue) dr 2009) f Lue dr 


i—s 


S f 5 2 
+ ij Lu(x,) dr ^ Lu(xy 45) dr + (f Lu(xr+s) dr) 
0 


— f P (u)(x,) dr. 
Now we apply E,[-|¥;] to this equality and get by the Markov property that P, -a.s. 
EX[M; — Ms|Fs] 
= pi-su^ (xs) — u^ (xs) — 2u (x) (Pr-su (xs) — u(xs)) 
— 2(py su (xs) — u(xs)) i ' Lu) dr 


t—5 


t—s | 
-2 f mu pis) Gg) dr 2) J p, (Lu)(xs) dr 
S i—35 
a Lu(x,) dr | pr(Lu)(xs) dr 


t—s pr’ t—s 
+2f [| Ba lLu(erLu(eyy)dr dr’ — f erra) dr. 


Since on the right-hand side the second and fifth, and also the third and sixth term 
add up to zero by Theorem 7.1 (ii), we obtain 


t—Ss 
E.M; — Ms) Fs] = pr-sU? (xs) — u?(x5) — 2 Í p (Lupi-s-ru)(xs) dr 
t—s pr’ 
+ 2 | Í pr (Lupy -, (Lu))(x;) dr’ dr 


t—s 2 t—s 
= f p (L(u2)) (x5) 4-2 f 5 GER d 
0 0 


Since on the right-hand side the first and fourth term add up to zero and the third 
is, by Fubini's theorem, equal to 


1—5 Í—s 
2 Í p. (Lu TE T, ar’) inus 
E 


t—5 
= 2 | Pr (Lu(pr—s—ru = u)) (Xs) dr, 




we see that 


$E_x[M_t - M_s \mid \mathcal{F}_s] = 0, \qquad P_x\text{-a.s.}$ □


Now we shall prove Theorem 2.4, even under the weaker condition (F2), but
assuming, in addition [to (F2c)], that

(A.1) $\lim_{N\to\infty} \langle \eta_k, F_N\rangle = F^{(k)}$ uniformly on $H_0^1$-balls for all $k \in \mathbb{N}$,


which by Proposition 4.1 also holds under assumption (F1). So, we consider the 
situation of Theorem 7.1(i) and adopt the notation from there. First we need a 
lemma which is a modification of [9], Theorem 4.1. 


LEMMA A.2. Let $E$ be a finite-dimensional linear space, $A\colon E \to \mathcal{L}(E)$ be
a Borel measurable function taking values in the set of symmetric nonnegative
definite linear operators on $E$ and $B\colon E \to E$ be a Borel measurable vector field.
Let

$L_{A,B}u := \operatorname{Tr} AD^2u + \langle B, Du\rangle, \qquad u \in C^2(E).$

Let $\mu$ be a probability measure on $E$ such that $L^*_{A,B}\mu = 0$ in the sense that
$|A|_{E\to E}, |B|_E \in L^1_{\mathrm{loc}}(E, \mu)$ and, for all $u \in C_c^2(E)$,

$\int L_{A,B}u\,d\mu = 0.$

Let $V\colon E \to \mathbb{R}_+$ be a $C^2$-smooth function with compact level sets and $\Theta\colon E \to \mathbb{R}_+$
be a Borel measurable function. Assume that there exists $Q \in L^1(E, \mu)$ such that
$L_{A,B}V \le Q - \Theta$. Then $\Theta \in L^1(E, \mu)$ and

$\int \Theta\,d\mu \le \int Q\,d\mu.$


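PROOF. Let $\xi\colon \mathbb{R}_+ \to \mathbb{R}_+$ be a $C^2$-smooth concave function such that
$\xi(r) = r$ for $r \in [0, 1]$, $\xi(r) = 2$ for $r \ge 3$ and $0 \le \xi' \le 1$. For $k \in \mathbb{N}$, let
$\xi_k(r) := k\,\xi(r/k)$. Then $\xi_k$ is a $C^2$-smooth function, $\xi_k(r) = 2k$ for $r \ge 3k$,
$\xi_k'' \le 0 \le \xi_k'(r) \uparrow 1$ and $\xi_k(r) \to r$ for all $r > 0$ as $k \to \infty$. Let $u_k(x) := \xi_k \circ V(x) - 2k$,
$x \in E$. Then $u_k \in C_c^2(E)$ and

$L_{A,B}u_k(x) = \xi_k' \circ V\, L_{A,B}V + \xi_k'' \circ V\, \langle DV, A\,DV\rangle \le \xi_k' \circ V\, L_{A,B}V$

since $A$ is nonnegative definite and $\xi_k'' \le 0$.
Now, since $\int L_{A,B}u_k\,d\mu = 0$, $0 \le \xi_k' \circ V \le 1$, $\Theta \ge 0$, and $L_{A,B}V \le Q - \Theta$, we
obtain

$\int \xi_k' \circ V\, \Theta\,d\mu \le \int \xi_k' \circ V\, Q\,d\mu.$

Then the assertion follows from Fatou's lemma. □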




PROOF OF THEOREM 2.4 [only assuming (F2) instead of (F1)]. (i) It follows 
from (F2a) that, for 


C := Xsup(Vp,(x)lx € Hj, Ix'|o x 2. p ic* / Pope), 


which is finite since Hj -balls are compact on X p, 
in * 
(A.2) Ln Vout XC — 252 9 pet on Ey forall N EN. 


Let N e N. Obviously Vp,«(x) — oo as |xlo > œ, x € Ew. 
Since Op *(x) — oo as [x|2 — oo, x € Ey, we conclude from (A.2) that 
Ly Vp,es(x) > —œ as |x|2 — oo, x € Ey. Hence, a generalization of Hasmin- 
skii's theorem [8], Corollary 1.3, implies that there exists a probability measure uy 
on Ey such that Lý uy = 0, that is, f Lyu duy =0 for all u € C2(Ey). Below 
we shall consider jzy as a probability measure on X p by setting uy (X p V En) — 0. 
Then, by Lemma A.2, we conclude from (A.2) that 


(A.3) J On k* duN < C. 
X 


Since Op, x+ has compact level sets in Xp, the sequence (uy) is uniformly tight 
on Xp. So, it has a limit point 4 (in the weak topology of measures) which is a 
probability measure on X p. Passing to a subsequence if necessary, we may assume 
that uy — u weakly. Then (2.20) follows from (A.3) since ©p x+ is lower semi- 
continuous. In particular, (X p V H1) — 0. 

Now we prove (2.19). Let k € N. Then it follows by (F2c), (F2d) that FÊ :— 
(n, Fv) € W1C, e. In particular, FY € L! (uy) N L(y) for all N € N, due to 
(A.3) and (2.20). Also, the maps x > (x", ng) belong to L! (uy)  L! (qu) for all 
N EN since |(x”, n| < [nz loolxl2. Thus, it follows from the dominated conver- 
gence theorem that f Lyuduy =0 for all u € C?(En). Let u € D. Then we have 
Tr{Ay D2u(x)) = TH AD^u(x)] for large enough N. Since zy — u weakly, it 
follows that 


J Tr{Ay D2u} dy > J Te{AD?u}du. 
So, we are left to show that 


F( + x" my)àuuduy > | (FO +(x", ng)) 3zudu as N — co. 
N 


Since F € WiC p,c*, by Corollary 5.3, there exists a sequence Gk n € D such 
that || F9 — Gy ,|[1,p,«* — 0 as n — oo. Then 


Í FO du(duy — dy) 
X 


- f. Gi eu (dun — dio) + | (F — Grn) vidi — dp). 




Since uy — u weakly, we conclude that the first term vanishes as N — oo for all 
n € N. On the other hand, the second term vanishes as n — oo uniformly in N € N 
since, by (A.3), 


[IF - Gen\(dun do) 


« [F9 — Gu], per | Opal dio) 


< (c + | Op,c* du) FO — Granli pct 


Since x", Nk) = (x. Ng //). the same arguments as above can be applied to (x”, nx). 
Furthermore, by (F2c), (F2d) for R > 0, 


fir — F* |ayu|duw 


< | FO — FO aul dun 
[(O, x ER} l 


+2 sup @(Op~)lðkulloo Onc dAN. 


{Op «* zR) 


By (A.1) and (A.3) first letting N — oo and then R — oo, the left-hand side of the 
above inequality goes to zero. So, (2.19) follows and (i) is completely proved. 

(ii) Let u be as in (i), u € D, and À > Àp,x* V A» i . Then by Proposition 6.7(1) 
and Theorem 6.4, 


(AA) | ael — wees? | des | (4 — Lud, 


where we used (2.19) in the last step. By Lemma 6.8, (A — L)(D) is dense 
in W1 Cp,«* and by Proposition 6.7(i), for f € W1Cp.c, 





1 

galfi <s IS l1, pe* 21 pce = NF Ut, pact — 
and fl fldu < f Op, du || fllii, ps. So, by (2.20) and Lebesgue’s dominated 
convergence theorem, we conclude that (A.4) extends to any f € W1Cp.« replac- 
ing (A — L)u. Hence, by (6.4) and Fubini’s theorem, for every f € D C WiCp xt, 


oQ oO 
af eM | nfanat- f udu=a | e f udydt. 
0 


Since $t \mapsto p_t f(x)$ is right continuous by (6.5) for all $x \in X_p$ and bounded,
assertion (ii) follows by the uniqueness of the Laplace transform and a monotone class
argument. □




REMARK A.3. One can check that if $u \in D$, $u = 0$ $\mu$-a.e., then $Lu = 0$ $\mu$-a.e.
(cf. [21], Lemma 3.1, where this is proved in a similar case). Hence, $(L, D)$ can
be considered as a linear operator on $L^s(X, \mu)$, $s \in [1, \infty)$, where we extend $\mu$
by zero to all of $X$. By [25], Appendix B, Lemma 1.8, $(L, D)$ is dissipative
on $L^s(X, \mu)$. Then by Lemma 6.8, we know that, for large enough $\lambda$, $(\lambda - L)(D)$
is dense in $W_1C_{p,\kappa^*}$ which, in turn, is dense in $L^1(X, \mu)$. Hence, the closure of
$(L, D)$ is maximal dissipative on $L^s(X, \mu)$, that is, strong uniqueness holds for
(L, D) on L*(X, u) for s = 1. In case (F1+) holds or V = 0, similar arguments 
show that our results in Section 4 imply that this is true for all s € [1, oo) as well. 
A more refined analysis, however, gives that this is, in fact, true merely under 
condition (F2). Details will be contained in a forthcoming paper. This generalizes 
the main result in [16] which was proved there for s — 2 in the special situation 
when F satisfies (2.15) with U(x) = 5x, x € IR, and d =0, that is, in the case 
of the classical stochastic Burgers equation. For more details on the L!-theory for 
the Kolmogorov operators of stochastic generalized Burgers equations, we refer 
to [49]. 


Acknowledgments. The authors would like to thank Giuseppe Da Prato for 
valuable discussions. 

The results of this paper have been announced in [48] and presented in the 
"Seminar on Stochastic Analysis" in Bielefeld, as well as in invited talks at con- 
ferences in Vilnius, Beijing, Kyoto, Nagoya in June, August, September 2002, 
January 2003 respectively, and also both at the International Congress of Math- 
ematical Physics in Lisbon in July 2003 and the conference in Levico Terme on 
SPDE in January 2004. The authors would like to thank the respective organizers 
for these very stimulating scientific events and their warm hospitality. 


REFERENCES 


[1] ALBEVERIO, S. and FERRARIO, B. (2002). Uniqueness results for the generators of the two- 
dimensional Euler and Navier-Stokes flows. The case of Gaussian invariant measures. 
J. Funct. Anal. 193 77-93. MR1923629 

[2] ALBEVERIO, S. and RÓCKNER, M. (1991). Stochastic differential equations in infinite di- 
mensions: Solutions via Dirichlet forms. Probab. Theory Related Fields 89 347—386. 
MR1113223 

[3] BARBU, V. and DA PRATO, G. (2002). A phase field system perturbed by noise. Nonlinear 
Anal. 51 1087—1099. MR1926087 

[4] BARBU, V., DA PRATO, G. and DEBUSSCHE, A. (2004). The Kolmogorov equation associated
to the stochastic Navier-Stokes equations in 2D. Infin. Dimens. Anal. Quantum Probab. 
Relat. Top. 2 163-182. MR2066128 

[5] BAUER, H. (1974). Wahrscheinlichkeitstheorie und Grundzüge der Masstheorie 2. Erweiterte 
Auflage. de Gruyter, Berlin. MR514208 

[6] BERTINI, L., CANCRINI, N. and JONA-LASINIO, G. (1994). The stochastic Burgers equation. 
Comm. Math. Phys. 165 211—232. MR1301846 

[7] BLUMENTHAL, R. M. and GETOOR, R. K. (1968). Markov Processes and Potential Theory. 
Academic Press, New York. MR264757 




[8] BOGACHEV, V. and RÓCKNER, M. (2000). A generalization of Khasminski's theorem on the 
existence of invariant measures for locally integrable drifts. Theory Probab. Appl. 45 
363—378. MR1967783 
[9] BOGACHEV, V. and RÖCKNER, M. (2001). Elliptic equations for measures on infinite
dimensional spaces and applications. Probab. Theory Related Fields 120 445–496. MR1853480

[10] BRICK, P., FUNAKI, T. and WOYCZYNSKY, W., EDS. (1996). Nonlinear Stochastic PDEs: 
Hydrodynamic Limit and Burgers’ Turbulence. Springer, Berlin. MR1395889 

[11] BURGERS, J. M. (1948). A mathematical model illustrating the theory of turbulence. In Ad-
vances in Applied Mathematics (R. von Mises and Th. von Karman, eds.) 1 171—199. 
MR27195 

[12] CARDON-WEBER, C. (1999). Large deviations for a Burgers’ type SPDE. Stochastic Process. 
Appl. 84 53—70. MR1720097 

[13] CERRAI, S. (2001). Second Order PDE's in Finite and Infinite dimensions: A Probabilistic 
Approach. Springer, Berlin. MR1840644 

[14] CHOQUET, G. (1969). Lectures on Analysis. I, II, III. Benjamin, Inc., New York. MR0250011 

[15] DA PRATO, G. (2001). Elliptic operators with unbounded coefficients: Construction of a max- 
imal dissipative extension. J. Evol. Equ. 1 1-18. MR1838318 

[16] DA PRATO, G. and DEBUSSCHE, A. (2000). Maximal dissipativity of the Dirichlet operator
corresponding to the Burgers equation. In Stochastic Processes, Physics and Geometry: 
New Interplays 1 85-98. Amer. Math. Soc., Providence, RI. MR1803380 

[17] DA PRATO, G. and DEBUSSCHE, A. (2003). Ergodicity for the 3D stochastic Navier-Stokes
equations. J. Math. Pures Appl. 82 877—947. MR2005200 

[18] DA PRATO, G., DEBUSSCHE, A. and TEMAM, R. (1995). Stochastic Burgers equation. In
Nonlinear Differential Equations Appl. 1 389–402. Birkhäuser, Basel. MR1300149

[19] DA PRATO, G. and GATAREK, D. (1995). Stochastic Burgers equation with correlated noise. 
Stochastics Stochastics Rep. 52 29-41. MR1380259 

[20] DA PRATO, G. and RÓCKNER, M. (2002). Singular dissipative stochastic equations in Hilbert 
spaces. Probab. Theory Related Fields 124 261—303. MR1936019 

[21] DA PRATO, G. and RÓCKNER, M. (2004). Invariant measures for a stochastic porous medium 
equation. In Stochastic Analysis and Related Topics (H. Kunita, Y. Takanashi and 
S. Watanabe, eds.) 13-29. Math. Soc. Japan, Tokyo. MR2083701 

[22] DA PRATO, G. and TUBARO, L. (2001). Some results about dissipativity of Kolmogorov
operators. Czechoslovak Math. J. 51 685–699. MR1864036

[23] DA PRATO, G. and VESPRI, V. (2002). Maximal LP regularity for elliptic equations with 
unbounded coefficients. Nonlinear Anal. 49 747—755. MR1894782 

[24] DA PRATO, G. and ZABCZYK, J. (1992). Stochastic Equations in Infinite Dimensions. Cam- 
bridge Univ. Press. MR1207136 

[25] EBERLE, A. (1999). Uniqueness and Non-Uniqueness of Semigroups Generated by Singular 
Diffusion Operators. Springer, Berlin. MR1734956 

[26] FLANDOLI, F. and GATAREK, D. (1995). Martingale and stationary solutions for stochastic 
Navier-Stokes equations. Probab. Theory Related Fields 102 367—391. MR1339739 

[27] FLANDOLI, F. and GOZZI, F. (1998). Kolmogorov equation associated to a stochastic Navier— 
Stokes equation. J. Funct. Anal. 160 312—336. MR1658680 

[28] GILBARG, D. and TRUDINGER, N. S. (1983). Elliptic Partial Differential Equations of Second 
Order, 2nd ed. Springer, Berlin. MR737190 

[29] GYÖNGY, I. (1998). Existence and uniqueness results for semilinear stochastic partial
differential equations. Stochastic Process. Appl. 73 271–299. MR1608641

[30] GYÖNGY, I. and NUALART, D. (1999). On the stochastic Burgers equation in the real line. 
Ann. Probab. 27 782–802. MR1698967

[31] GYÓNGY, I. and ROVIRA, C. (1999). On stochastic partial differential equations with polyno- 
mial nonlinearities. Stochastics Stochastics Rep. 67 123-146. MR1717799 




[32] GYÖNGY, I. and ROVIRA, C. (2000). On L?-solutions of semilinear stochastic partial differ- 
ential equations. Stochastic Process. Appl. 90 83-108. MR1787126 

[33] HOPF, E. (1950). The partial differential equation uz + uux = [Luxx. Comm. Pure Appl. Math. 
3 201—230. MR47234 

[34] JARCHOW, H. (1981). Locally Convex Spaces. Teubner, Stuttgart. MR632257 

[35] KRYLOV, N. V. and RÓCKNER, M. (2005). Strong solutions for stochastic equations with 
singular time dependent drift. Probab. Theory Related Fields 131 154—196. MR2117951 

[36] KÜHNEMUND, F. and VAN NEERVEN, J. (2004). A Lie-Trotter product formula for Ornstein- 
Uhlenbeck semigroups in infinite dimensions. J. Evol. Equ. 4 53-73. MR2047306 

[37] LADYZHENSKAYA, O. A., SOLONNIKOV, N. A. and URAL’TSEVA, N. N. (1968). Lin- 
ear and Quasi-Linear Equations of Parabolic Type. Amer. Math. Soc., Providence, RI. 
MR0241821 

[38] LANJRI, X. and NUALART, D. (1999). The stochastic Burgers equation: Absolute continuity 
of the density. Stochastics Stochastics Rep. 66 273—292. MR1692868 

[39] LÉON, J. A., NUALART, D. and PETTERSSON, R. (2000). The stochastic Burgers equation: 
Finite moments and smoothness of the density. Infin. Dimens. Anal. Quantum Probab. 
Relat. Top. 3 363-385. MR1811248 

[40] LINDENSTRAUSS, J. and TZAFRIRI, L. (1979). Classical Banach Spaces 2. Function Spaces. 
Springer, Berlin. MR540367

[41] Liu, T. P. and Yu, S. H. (1997). Propagation of stationary viscous Burgers shock under the 
effect of boundary. Arch. Rational Mech. Anal. 139 57-82. MR1475778 

[42] LONG, H. and SIMÃO, I. (2000). Kolmogorov equations in Hilbert spaces with application to
essential self-adjointness of symmetric diffusion operators. Osaka J. Math. 37 185—202. 
MR1750276 

[43] Ma, Z.-M. and RÓCKNER, M. (1992). Introduction to the Theory of (Nonsymmetric) Dirichlet 
Forms. Springer, Berlin. MR1214375 

[44] MATSUMURA, M. and NISHIHARA, K. (1994). Asymptotic stability of travelling waves for 
scalar viscous conservation laws with non-convex nonlinearity. Comm. Math. Phys. 165 
83-96. MR1298944 

[45] NUALART, D. and VIENS, F. (2000). Evolution equation of a stochastic semigroup with white- 
noise drift. Ann. Probab. 28 36-73. MR1755997 

[46] RÓCKNER, M. (1999). LP-analysis of finite and infinite dimensional diffusion operators. Sto- 
chastic PDE's and Kolmogorov’s Equations in Infinite Dimensions. Lecture Notes in 
Math. 1718 65—116. Springer, Berlin. MR1731795 

[47] RÓCKNER, M. and SOBOL, Z. (2003). Markov solutions for martingale problems: Method of 
Lyapunov functions. Preprint. 

[48] RÓCKNER, M. and SOBOL, Z. (2004). A new approach to Kolmogorov equations in infinite 
dimensions and applications to stochastic generalized Burgers equations. C. R. Acad. Sci. 
Paris 338 945—949. MR2066356 

[49] RÓCKNER, M. and SOBOL, Z. (2004). L!-theory for the Kolmogorov operators of stochastic 
generalized Burgers equations. In Quantum Information and Complexity: Proceedings of 
the 2003 Meijo Winter School and Conference (T. Hida, K. Saitó and Si Si, eds.) 87—105. 
World Scientific, Singapore. 

[50] SADOVNICHAYA, I. V. (2000). The direct and the inverse Kolmogorov equation for the sto- 
chastic Schrödinger equation. Vestnik Moskov. Univ. Ser. I Mat. Mekh. 2000 15—20, 86. 
[Translation in Moscow Univ. Math. Bull. (2000) 55 15—19.] MR1843594 

[51] SCHWARTZ, L. (1973). Radon Measures on Arbitrary Topological Spaces and Cylindrical 
Measures. Oxford Univ. Press. MR426084 

[52] STANNAT, W. (1999). The theory of generalized Dirichlet forms and its applications in analysis 
and stochastics. Mem. Amer. Math. Soc. 142 1-101. MR1632609 




[53] STANNAT, W. (1999). (Nonsymmetric) Dirichlet operators on L^1: Existence, uniqueness
and associated Markov processes. Ann. Sc. Norm. Super. Pisa Cl. Sci. (4) 28 99-140. 
MR1679079 

[54] STROOCK, D. W. and VARADHAN, S. R. S. (1979). Multidimensional Diffusion Processes. 
Springer, Berlin. MR532498 

[55] TRUMAN, A. and ZHAO, H. Z. (1996). On stochastic diffusion equations and stochastic
Burgers' equations. J. Math. Phys. 37 283–307. MR1370174


DEPARTMENTS OF MATHEMATICS AND STATISTICS
PURDUE UNIVERSITY
WEST LAFAYETTE, INDIANA 47906
USA
E-MAIL: roeckner@math.purdue.edu

DEPARTMENT OF MATHEMATICS
UNIVERSITY OF WALES SWANSEA
SINGLETON PARK, SWANSEA
SA2 8PP, WALES
UNITED KINGDOM
E-MAIL: z.sobol@swansea.ac.uk


The Annals of Probability
2006, Vol. 34, No. 2, 728–742
DOI: 10.1214/009117905000000684
© Institute of Mathematical Statistics, 2006


A CHARACTERIZATION OF THE INFINITELY DIVISIBLE 
SQUARED GAUSSIAN PROCESSES 


BY NATHALIE EISENBAUM AND HAYA KASPI 
Université Paris VI-CNRS and Technion 


We show that, up to multiplication by constants, a Gaussian process has 
an infinitely divisible square if and only if its covariance is the Green function 
of a transient Markov process. 


1. Introduction. The question of the infinite divisibility of squared Gaussian
vectors is an old problem which was first raised by Paul Lévy in 1948 [10]. Given a
centered Gaussian vector $(\eta_1, \ldots, \eta_p)$, when can the vector $(\eta_1^2, \ldots, \eta_p^2)$ be
written as a sum of $n$ i.i.d. $p$-dimensional vectors for every $n \in \mathbb{N}$? Many authors have
worked on this problem. We refer the reader to [6, 8, 13, 14, 17] and the references
therein. In 1984, Griffiths [9] established a characterization of the
$p$-dimensional centered Gaussian vectors with an infinitely divisible square. His
criterion is difficult to use since it requires the computation of the signs of all the
cofactors of the covariance matrix. Indeed, except for the Brownian motion (and,
more generally, Gaussian Markov processes), there were no examples, in the
literature, of processes satisfying this remarkable property nor examples of processes
lacking it.

We have recently shown [7] that the family of fractional Brownian motions
provides examples of both kinds. A fractional Brownian motion is a real-valued
centered Gaussian process with a covariance given by

$g(x, y) = |x|^\beta + |y|^\beta - |x - y|^\beta,$

where the index $\beta$ is in (0, 2). We proved that when $\beta$ is in (0, 1], then the square
of this process is infinitely divisible, and when $\beta$ is in (1, 2), it is not. The critical
index 1 corresponds to the Brownian motion. In both cases we have used the
criterion of Griffiths. We have shown in [7] that if the Green function of a transient
Markov process is symmetric, then it is the covariance of a Gaussian process with
an infinitely divisible square. In particular, when the index $\beta$ is in (0, 1], the
covariance of the corresponding fractional Brownian motion can be interpreted as the
Green function of the symmetric stable process with index $(\beta + 1)$, killed at its
hitting time of 0. To treat the second case, we have shown directly that the condition
of Griffiths is not satisfied.
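
To make the covariance concrete, the following sketch evaluates $g$ on a finite set of points and checks that, for the critical index 1 and nonnegative arguments, it reduces to $2\min(x, y)$, the Brownian covariance up to the factor 2. The function names are illustrative and do not come from the paper.

```python
def fbm_covariance(x, y, beta):
    """g(x, y) = |x|^beta + |y|^beta - |x - y|^beta for an index beta in (0, 2)."""
    if not 0 < beta < 2:
        raise ValueError("beta must lie in (0, 2)")
    return abs(x) ** beta + abs(y) ** beta - abs(x - y) ** beta


def covariance_matrix(points, beta):
    """Covariance matrix (g(x_i, x_j)) of the process restricted to `points`."""
    return [[fbm_covariance(x, y, beta) for y in points] for x in points]


points = [1, 2, 3]
brownian = covariance_matrix(points, 1)
# For beta = 1 and x, y >= 0 the covariance equals 2 * min(x, y).
assert all(brownian[i][j] == 2 * min(points[i], points[j])
           for i in range(3) for j in range(3))
```

The same helper can be used to build finite covariance matrices for other indices, to which a finite-dimensional criterion can then be applied.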


Received January 2004; revised March 2005. 
AMS 2000 subject classifications. 60E07, 60G15, 60325, 60355. 
Key words and phrases. Gaussian processes, infinite divisibility, Markov processes, local time. 




INFINITELY DIVISIBLE SQUARED GAUSSIAN 729 


In view of these examples, a natural question arises. Is the representation of 
the covariance function of a centered Gaussian process, as the Green function of a 
symmetric Markov process, a necessary condition for the infinite divisibility of its 
square? We show in this paper that, up to a multiplication by a constant function, 
the answer is affirmative. The result is presented in Section 3 in the form of a neces- 
sary and sufficient condition. The proof is based this time on a criterion for infinite 
divisibility established by Bapat [1], which we recall in Section 2. Although, at 
first sight, the verification of this criterion is as difficult as that of the equivalent 
criterion of Griffiths (one has to check here, too, the sign of each cofactor of the 
covariance matrix), it allows us to shorten the arguments significantly.

Gaussian processes with a covariance equal to a Green function play an
important role in the study of Markov processes. Indeed, the Isomorphism theorem
of Dynkin [5] provides an identity in law connecting each of these Gaussian
processes to the local time process of the corresponding symmetric Markov
process. This identity has been exploited to study properties of the local time
process of the Markov process, using similar properties of the Gaussian processes.
We choose to mention here only two references, but many more papers on the
subject are quoted in [2], for example. In [11] Marcus and Rosen have studied
sample path properties of the local time process using similar properties of the
Gaussian process. In [2] Bass, Eisenbaum and Shi used the Isomorphism theorem
to establish the transience of the most visited sites of a symmetric stable process.
The question of characterizing the Gaussian processes with a covariance that is
equal to a Green function of a symmetric Markov process has been open since the
Isomorphism theorem of Dynkin was first proved. The importance of an answer to
this question is twofold; it will give a powerful tool for the study of these Gaussian
processes, as well as that of their associated Markov processes. The results of
Section 3 provide an answer to this question.

In Section 5 we show that the Brownian sheet does not have an infinitely di- 
visible square. Recalling that linear combinations of the squared components of 
centered Gaussian vectors are infinitely divisible, we find this example and that of 
the fractional Brownian motion, with index in (1, 2), somewhat counterintuitive. 


2. The criterion of Bapat. Let $A = (A_{ij})_{1\le i,j\le p}$ be a $p \times p$ matrix. We write
$A \ge 0$ if $A_{ij} \ge 0$ for all $i, j$.


DEFINITION 2.1. A matrix $A = (A_{ij})_{1\le i,j\le p}$ is said to be an M-matrix if we
have the following:

(i) $A_{ij} \le 0$ for all $i \ne j$,

(ii) $A$ is nonsingular and $A^{-1} \ge 0$.

We refer the reader to [3] for the theory of M-matrices.
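
Both conditions of Definition 2.1 can be checked mechanically for a given matrix. The sketch below does so with exact rational arithmetic. The helper names `inverse` and `is_m_matrix` are ours, chosen for illustration.

```python
from fractions import Fraction


def inverse(matrix):
    """Invert a square matrix exactly by Gauss-Jordan elimination; None if singular."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def is_m_matrix(a):
    """Definition 2.1: nonpositive off-diagonal entries, nonsingular, inverse >= 0."""
    n = len(a)
    if any(a[i][j] > 0 for i in range(n) for j in range(n) if i != j):
        return False
    inv = inverse(a)
    return inv is not None and all(v >= 0 for row in inv for v in row)


# A tridiagonal example whose inverse has only nonnegative entries.
assert is_m_matrix([[2, -1, 0], [-1, 2, -1], [0, -1, 1]])
# Nonpositive off-diagonal entries alone are not enough: this inverse has negative entries.
assert not is_m_matrix([[1, -2], [-2, 1]])
```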


730 N. EISENBAUM AND H. KASPI 


DEFINITION 2.2. A diagonal matrix is called a signature matrix if all its en- 
tries are either 1 or —1. 


The Laplace transform $\psi(t_1, \ldots, t_p)$ of the square of a $p$-dimensional Gaussian
vector is given by

$\psi(t) = [\det(I + GT)]^{-1/2},$

where $t = (t_1, \ldots, t_p)$, $G$ is a positive definite $p \times p$ matrix, $T$ is the diagonal
matrix with entries $T_{ii} = t_i$, and $I$ is the $p \times p$ identity matrix.

Bapat [1] obtained the following characterization of the matrices $G$ for which
the Laplace transform $\psi$ is infinitely divisible, that is, for which $\psi^\delta(t)$ is a Laplace
transform for any $\delta > 0$.
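
As an illustration of the formula only, the following sketch evaluates $[\det(I + GT)]^{-\delta/2}$ for a small covariance matrix. The function names and the floating-point elimination are our own choices. The code says nothing about whether the resulting power is itself a Laplace transform, which is the content of the criterion below.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting (floating point)."""
    a = [[float(v) for v in row] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for k in range(col, n):
                a[r][k] -= factor * a[col][k]
    return det


def laplace_transform(g, t, delta=1.0):
    """Evaluate [det(I + G T)]^(-delta/2), where T = diag(t)."""
    n = len(g)
    # (G T)_{ij} = G_{ij} * t_j because T is diagonal.
    m = [[(1.0 if i == j else 0.0) + g[i][j] * t[j] for j in range(n)]
         for i in range(n)]
    return determinant(m) ** (-delta / 2.0)


g = [[2, 1], [1, 2]]
# I + G T = [[3, 1], [1, 3]] has determinant 8 for t = (1, 1).
assert abs(laplace_transform(g, (1, 1)) - 8 ** -0.5) < 1e-12
```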


THEOREM A. The Laplace transform $\psi$ is infinitely divisible if, and only if,
there exists a signature matrix $S$ such that $SG^{-1}S$ is an M-matrix.


REMARK 2.3. It is elementary to check, using Theorem A, that a centered
three-dimensional Gaussian vector $(\eta_1, \eta_2, \eta_3)$ such that

$E(\eta_1\eta_2)\,E(\eta_2\eta_3)\,E(\eta_3\eta_1) < 0$

cannot have an infinitely divisible square.
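
For small $p$ the criterion of Theorem A can be tested by brute force over the $2^p$ signature matrices. The sketch below does this with exact arithmetic. It uses the fact that $S^{-1} = S$, so the inverse of $SG^{-1}S$ is $SGS$. The function names and the three sample covariances are illustrative assumptions, not taken from the paper. The third example has $E(\eta_1\eta_2)E(\eta_2\eta_3)E(\eta_3\eta_1) < 0$, the situation of Remark 2.3.

```python
from fractions import Fraction
from itertools import product


def inverse(matrix):
    """Invert a square matrix exactly by Gauss-Jordan elimination; None if singular."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def bapat_signature(g):
    """Return signs (s_1, ..., s_p) with S G^{-1} S an M-matrix, or None if none exist."""
    n = len(g)
    g_inv = inverse(g)
    if g_inv is None:
        return None
    for signs in product((1, -1), repeat=n):
        if signs[0] == -1:
            continue  # S and -S give the same matrix S G^{-1} S
        off_diagonal_ok = all(signs[i] * g_inv[i][j] * signs[j] <= 0
                              for i in range(n) for j in range(n) if i != j)
        # The inverse of S G^{-1} S is S G S, which must be entrywise nonnegative.
        inverse_ok = all(signs[i] * g[i][j] * signs[j] >= 0
                         for i in range(n) for j in range(n))
        if off_diagonal_ok and inverse_ok:
            return signs
    return None


brownian = [[1, 1, 1], [1, 2, 2], [1, 2, 3]]      # min(x, y) at the points 1, 2, 3
flipped = [[1, -1, 1], [-1, 2, -2], [1, -2, 3]]   # same vector with the second sign reversed
frustrated = [[4, 1, 1], [1, 4, -1], [1, -1, 4]]  # product of the three covariances is negative

assert bapat_signature(brownian) == (1, 1, 1)
assert bapat_signature(flipped) == (1, -1, 1)
assert bapat_signature(frustrated) is None
```

In the third example no choice of signs makes all entries of $SGS$ nonnegative, which is the elementary argument behind Remark 2.3.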


3. A necessary and sufficient condition for infinite divisibility. In a previous
work [7] on squared Gaussian processes, we have established the following
result.


THEOREM B. Let $g$ be the Green function of a strongly symmetric transient
Borel right Markov process with state space $(E, \mathcal{E})$. Let $\eta$ be a centered Gaussian
process with covariance $g$. Then the process $\eta^2$ is infinitely divisible.


The set $E$ (by the assumption of a right process) is a Borel subset of a compact
metric space and $\mathcal{E}$ is the $\sigma$-field of its Borel sets. Note that if $(\eta^2(x), x \in E)$ is
an infinitely divisible squared Gaussian process, then, for any $\mathcal{E}$-measurable real
valued function $d$, the process $(d^2(x)\eta^2(x), x \in E)$ is also infinitely divisible. We
therefore have the following:


COROLLARY 3.1. Let $g$ be the Green function of a symmetric transient
Markov process on a state space $(E, \mathcal{E})$. Then for any $\mathcal{E}$-measurable real valued
function $d$ on $E$, there exists a centered Gaussian process $\eta$ with covariance equal
to $(d(x)g(x, y)d(y), (x, y) \in E \times E)$. The process $\eta^2$ is infinitely divisible.


We shall first treat the case when $E$ is a finite set. Theorem 3.2 completes
Corollary 3.1, for this case, by showing that its sufficient condition on the covariance is
also necessary for the infinite divisibility of $\eta^2$.




THEOREM 3.2. Let $\eta$ be a centered Gaussian vector, indexed by a finite set $E$,
with a positive definite covariance function $(G(x, y), (x, y) \in E \times E)$. The vector
$\eta^2$ is infinitely divisible if, and only if, there exists a real valued function $d$ on $E$
such that, for any $x, y \in E$,

$G(x, y) = d(x)\,g(x, y)\,d(y),$

where the function $g$ is the Green function of a transient symmetric Markov
process.


REMARK 3.3. Note that the Green function $g$ of a symmetric Markov
process $X$ is always positive definite. Indeed, it is positive semidefinite (see, e.g.,
[11] or [7]) and it has been shown in [7], Section II, that, for any $x_1, x_2, \ldots, x_n$ in
the state space of $X$, the matrix $(g(x_i, x_j))_{1\le i,j\le n}$ is invertible.


Let $\eta$ be a centered Gaussian process indexed by an infinite set $E$. Then it has
an infinitely divisible square if, for every finite subset $F$ of $E$, the covariance of
$(\eta_x, x \in F)$ satisfies the condition of Theorem 3.2. This does not guarantee, a
priori, that, for some deterministic function $d$, the covariance of $(d(x)\eta_x, x \in E)$ is
the Green function of a transient symmetric Markov process. Restricting our
attention to $E = \mathbb{R}$, and under an additional continuity assumption on the covariance
function, the following theorem establishes the necessity of that condition.


THEOREM 3.4. Let $\eta$ be a centered Gaussian process, indexed by $\mathbb{R}$, with
a positive definite covariance function $(G(x, y), (x, y) \in \mathbb{R}^2)$. Assume that $G$ is
jointly continuous. If the process $\eta^2$ is infinitely divisible, then there exists a
measurable real valued positive function $d$ on $\mathbb{R}$ such that, for any $x, y \in \mathbb{R}$,

$G(x, y) = d(x)\,g(x, y)\,d(y),$

where the function $g$ is the Green function of a symmetric transient Markov
process.


Theorem 3.4 is proved in Section 4. Although Theorem 3.2 was a good hint to
anticipate Theorem 3.4, we could not use it directly to prove it. Our argument is
based on an explicit construction of the function $d$ and of the Green function $g$ of
the Markov process that will be associated with the Gaussian process $\eta$. As a first
step of this construction, we show that the covariance $G$ has to be positive. The
proof of Theorem 3.4 remains valid, with simple changes, when the index set $\mathbb{R}$
is replaced by a separable locally compact metric space. Moreover, we will see in
Section 4 that, as a by-product of the proofs of Theorems 3.2 and 3.4, we obtain
the following characterization of the associated Gaussian processes, which makes
use of the definition below.


DEFINITION 3.5. A $p \times p$ matrix $A$ is said to be diagonally dominant if, for
every $i = 1, \ldots, p$, $\sum_{j=1}^{p} A_{ij} \ge 0$.


THEOREM 3.6. (i) A positive definite matrix $G$ is the Green function of a finite
state space transient Markov process if, and only if, $G^{-1}$ is a diagonally dominant
M-matrix.

(ii) Let $E$ be a separable locally compact metric space. Let $(\eta_x, x \in E)$ be
a Gaussian process with a continuous positive definite covariance
$(G(x, y), (x, y) \in E^2)$. Then $G$ is the Green function of a transient Markov process
on $E$ if, and only if, for every $x_1, x_2, \ldots, x_p$ in $E$, the inverse of the matrix
$(G(x_i, x_j))_{1 \le i, j \le p}$ is a diagonally dominant M-matrix.


4. Proofs. 


PROOF OF THEOREM 3.2. Let $E$ be the finite set $\{x_1, \ldots, x_p\}$. Theorem A
guarantees the existence of a signature matrix $S$ with diagonal $S_i$, $i = 1, \ldots, p$,
such that the covariance matrix of $(S_i \eta_{x_i}, i = 1, \ldots, p)$ is positive. We shall therefore
assume, from the outset, that the covariance $G$ is positive.

We will actually prove that the covariance $G$ satisfies

$$G(x, y) = \frac{D(y)}{D(x)}\, g(x, y), \tag{1}$$

where the function $g$ is the Green function of a transient Markov process and $D$ is
a strictly positive function.

This will be sufficient to establish Theorem 3.2. Indeed, if $G$ satisfies (1), then
we have

$$g(x, y) = D(x)\, G(x, y)\, D(y)\, \frac{1}{D^2(y)}.$$


Let $U$ be the corresponding potential with density $g(x, y)$ with respect to a reference
measure $\mu$. That is,

$$U f(x) = \int g(x, y) f(y)\, \mu(dy).$$

We then set $m(dy) = \mu(dy)/D^2(y)$. With respect to $m$, the kernel $U$ has densities
$\tilde{g}(x, y) = D(x) G(x, y) D(y)$. Consequently, $\tilde{g}$ is a symmetric Green function.
Setting $d = 1/D$, one obtains the necessity of the condition of Theorem 3.2.

To prove (1), let $G$ be the covariance matrix of a $p$-dimensional centered
Gaussian vector with an infinitely divisible square. By Theorem A, $G^{-1}$ is an
M-matrix. This implies (see [3]) that

$$G^{-1} = cI - B, \tag{2}$$

where $B \ge 0$ and $c$ is strictly greater than the absolute value of any eigenvalue
of $B$. Further, by [3], Chapter 6, page 137, M36, since $G^{-1}$ is an M-matrix, there
exists a diagonal matrix $D$ such that $D_{ii} > 0$ for all $i$ and $D G^{-1} D^{-1}$ has strictly
positive row sums. This means that, for any $i$,

$$\sum_{j=1}^{p} (D G^{-1} D^{-1})_{ij} > 0.$$

According to Definition 3.5, $D G^{-1} D^{-1}$ is diagonally dominant. Set $T = D\left(\frac{1}{c} B\right) D^{-1}$.
Then

$$D G^{-1} D^{-1} = D(cI - B) D^{-1} = c(I - T).$$

Note that, for any $i$,

$$\sum_{j=1}^{p} T_{ij} < 1.$$


The matrix $T$ is therefore the transition matrix of a transient Markov chain
$(X_n)_{n \in \mathbb{N}}$ with state space $E = \{x_1, x_2, \ldots, x_p\}$ satisfying $T_{ij} = P_{x_i}(X_1 = x_j)$, and
$\sum_j P_{x_i}(X_1 = x_j) = 1 - P_{x_i}(X_1 = \Delta)$, where $\Delta$ denotes a cemetery point. Let
$\ell_\infty^{x_j}$ denote the total number of visits of $X$ to the state $x_j$. The Green function of $X$
is defined by

$$g(x_i, x_j) = E_{x_i}(\ell_\infty^{x_j}).$$

It can be computed as follows:

$$E_{x_i}(\ell_\infty^{x_j}) = E_{x_i} \sum_{n=0}^{\infty} 1_{\{X_n = x_j\}} = \sum_{n=0}^{\infty} (T^n)_{ij} = ((I - T)^{-1})_{ij},$$

which is defined and is finite, since, by (2) and the discussion following it, the
spectral radius of $T$ is strictly smaller than 1. Hence, we can write

$$c\, D G D^{-1} = g;$$

that is, for every $x, y$ in $E$,

$$c\, G(x, y) = \frac{D(y)}{D(x)}\, g(x, y).$$

We shall now use the well-known method to turn a finite state space Markov chain 
into a continuous time Markov process by subordination to a Poisson process with 
rate c (see, e.g., [4]). By its construction, this Markov process is transient with 
potential density (Green function) equal to g and (1) is established. □
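
The construction used in this proof can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: the covariance $G(x_i, x_j) = x_i \wedge x_j$ at the points 1, 2, 3, the helper name `green_from_covariance`, and the particular choice $D = \mathrm{diag}(1/(Ge)_i)$ with $e = (1, \ldots, 1)$ are not taken from the text. That choice works because $G^{-1}(Ge) = e > 0$, so the row sums of $D G^{-1} D^{-1}$ equal $D_{ii} > 0$.

```python
import numpy as np

def green_from_covariance(G):
    """Return (c, D, T, g) with g = (I - T)^{-1} = c * D G D^{-1}.

    Assumes G >= 0 entrywise and that G^{-1} is an M-matrix.
    The choice of D below is one possible choice, made for illustration.
    """
    G = np.asarray(G, dtype=float)
    p = G.shape[0]
    A = np.linalg.inv(G)                       # A = G^{-1}
    off_diag = A - np.diag(np.diag(A))
    if np.any(off_diag > 1e-12):
        raise ValueError("G^{-1} has a positive off-diagonal entry")
    c = float(np.max(np.diag(A)))              # makes B = cI - A >= 0, as in (2)
    B = c * np.eye(p) - A
    x = G @ np.ones(p)                         # x > 0 and A x = e > 0
    D = np.diag(1.0 / x)                       # hypothetical choice of D
    D_inv = np.diag(x)
    T = D @ (B / c) @ D_inv                    # T = D (B / c) D^{-1}
    g = np.linalg.inv(np.eye(p) - T)           # equals sum_n T^n
    return c, D, T, g

points = np.array([1.0, 2.0, 3.0])
G = np.minimum.outer(points, points)           # G(x, y) = min(x, y)
c, D, T, g = green_from_covariance(G)
row_sums = T.sum(axis=1)                       # each strictly below 1
```

On this example $c = 2$ and the row sums of $T$ are $5/6$, $9/10$ and $11/12$, so $T$ is substochastic, and `g` agrees with $c\,DGD^{-1}$ up to rounding.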


REMARK 4.1. Let $Y$ be the Markov process with potential density $g$. Equation (1)
looks as if $G$ is the potential density of an $h$-path transform of $Y$. This is
indeed the case if $D$ is excessive for $Y$. Unfortunately, this is not true in general.
Consequently, we see that the collection of covariance functions that correspond
to Gaussian processes with an infinitely divisible square is somewhat richer than
the set of Green functions of symmetric Markov processes. This remark remains
true also when the index set $E$ of the Gaussian process is infinite.

To select covariance matrices that correspond to Green functions of Markov
processes, we have the condition given by Theorem 3.6(i). Indeed, assume that
$G^{-1}$ is a diagonally dominant M-matrix. Keeping the notation of the proof of
Theorem 3.2, we can choose $D = I$ and obtain $T = \frac{1}{c} B$. For any $i$, $\sum_{j=1}^{p} T_{ij} \le 1$.
Since the spectral radius of $T$ is strictly smaller than 1, for at least one $i$, the
above inequality must be strict. Therefore, $T$ is the transition matrix of a transient
Markov chain and we can conclude that $G$ is the Green function of a transient
Markov process with finite state space.

To see that the condition is also necessary, consider the Green function $g$
of a transient symmetric Markov process $X$ with a finite state space
$E = \{x_1, x_2, \ldots, x_p\}$. Then the inverse of the matrix $G = (g(x_i, x_j))_{1 \le i, j \le p}$ is a
diagonally dominant M-matrix. Indeed, $G_{ij} = \lambda_j ((I - T)^{-1})_{ij}$, where $\lambda_j$ is the expected
value of the exponential sojourn times in state $j$, $T$ is the transition matrix of the
Markov chain $(X(S_n))_{n \in \mathbb{N}}$, and $S_n$ is the $n$th jump time of $X$.
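
The difference between the criterion of Theorem 3.6(i) and the weaker M-matrix condition can be seen on two small matrices. The following is a sketch under assumptions that are not in the text: the function name `is_diagonally_dominant_m_matrix`, the tolerance, and both test covariances are illustrative. It uses the standard fact that a matrix with nonpositive off-diagonal entries is a nonsingular M-matrix exactly when its inverse is entrywise nonnegative.

```python
import numpy as np

def is_diagonally_dominant_m_matrix(A, tol=1e-10):
    """Check that A is a nonsingular M-matrix with nonnegative row sums."""
    A = np.asarray(A, dtype=float)
    off_diag = A - np.diag(np.diag(A))
    if np.any(off_diag > tol):
        return False                           # a positive off-diagonal entry
    try:
        A_inv = np.linalg.inv(A)
    except np.linalg.LinAlgError:
        return False                           # singular
    if np.any(A_inv < -tol):
        return False                           # inverse not entrywise nonnegative
    return bool(np.all(A.sum(axis=1) >= -tol)) # Definition 3.5

points = np.array([1.0, 2.0, 3.0])
G_bm = np.minimum.outer(points, points)        # x ^ y, a Green function

# Rescale by a deterministic function d = (1, 2, 1): G = d(x) g(x, y) d(y).
d = np.diag([1.0, 2.0, 1.0])
G_scaled = d @ G_bm @ d

dominant_bm = is_diagonally_dominant_m_matrix(np.linalg.inv(G_bm))
dominant_scaled = is_diagonally_dominant_m_matrix(np.linalg.inv(G_scaled))
```

Here `dominant_bm` is `True`, while `dominant_scaled` is `False`: the inverse of the rescaled covariance is still an M-matrix, but its second row sums to $-1/2$, in line with Remark 4.1.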


PROOF OF THEOREM 3.4. We first show that $G$ has to be positive. For a
fixed $n$, we define the function $d_n$ on $\mathbb{R}$ by

$$d_n(x) = \frac{k}{2^n} \quad \text{if } x \in \left[\frac{k}{2^n}, \frac{k+1}{2^n}\right).$$

Let $K$ be the compact set $[a, b]$ with $a < b$, and let $d_n(K) = \{d_n(x) : x \in K\}$.
The set $d_n(K)$ is a finite set. Since the process $(\eta^2_{d_n(x)}, x \in K)$ is infinitely divisible,
thanks to Theorem A, there exists a signature function $s_n$ (i.e., a function taking
values in $\{-1, +1\}$) on $d_n(K)$ such that, for every $x, y \in K$,

$$s_n(d_n(x))\, s_n(d_n(y))\, G(d_n(x), d_n(y)) \ge 0.$$

Note that

$$s_n(d_n(x))\, s_n(d_n(y))\, G(d_n(x), d_n(y)) = |G(d_n(x), d_n(y))|. \tag{3}$$

Since $G$ is continuous, we obtain, by letting $n$ tend to $\infty$, that
$\lim_{n} s_n(d_n(x))\, s_n(d_n(y))$ exists for all $(x, y)$ for which $G(x, y) \ne 0$ and, for such $(x, y)$,

$$\Big(\lim_{n \to \infty} s_n(d_n(x))\, s_n(d_n(y))\Big)\, G(x, y) = |G(x, y)|.$$


For $(x, y)$ in $K^2$ such that $G(x, y) = 0$, there exists a finite sequence $a_1, a_2, \ldots, a_p$
such that $G(x, a_1) G(a_1, a_2) \cdots G(a_{p-1}, a_p) G(a_p, y) \ne 0$. Indeed, for $z \in K$, set
$C(z) = \{u \in \mathbb{R} : G(z, u) \ne 0\}$. For each $z$, $C(z)$ is an open set and $\bigcup_{z \in K} C(z)$
is a covering of the compact set $K$. Thus, there exists a finite subcovering
$C(z_1), C(z_2), \ldots, C(z_m)$ of the set $K$. The sets of the covering are not disjoint.
Let $C(z_{i_1})$ be one of these sets. $C(z_{i_1})$ is a union of disjoint intervals. If $C(z_{i_1})$
does not cover $K$, then there exists $z_{i_2}$ in $\{z_1, z_2, \ldots, z_m\}$ such that $C(z_{i_2})$ covers
some of the endpoints of $C(z_{i_1})$ that are in $K$. If $C(z_{i_1}) \cup C(z_{i_2})$ does not
cover $K$, there exists $z_{i_3}$ in $\{z_1, z_2, \ldots, z_m\}$ such that $C(z_{i_3})$ covers some of the
endpoints of $C(z_{i_1}) \cup C(z_{i_2})$ that are in $K$, and we may continue until we
exhaust the whole finite covering above. Then we just have to make use of the sequence
$(z_{i_1}, z_{i_2}, \ldots, z_{i_m})$ to construct $(a_1, a_2, \ldots, a_p)$ $(p \le m)$, connecting $x$ to $y$ such
that $G(x, a_1) G(a_1, a_2) \cdots G(a_{p-1}, a_p) G(a_p, y) \ne 0$. Since


$$s_n(d_n(x))\, s_n^2(d_n(a_1))\, s_n^2(d_n(a_2)) \cdots s_n^2(d_n(a_p))\, s_n(d_n(y)) = s_n(d_n(x))\, s_n(d_n(y)),$$

we obtain, using (3), the existence of $\lim_{n} s_n(d_n(x))\, s_n(d_n(y))$ for all $x, y$. Set

$$H(x, y) = \lim_{n \to \infty} s_n(d_n(x))\, s_n(d_n(y)).$$

The function $H$ is symmetric and, by (3),

$$H(x, y) = \operatorname{sign}(G(x, y)),$$

and by its definition, for all $x, y, z \in K$,

$$H(x, y) = H(x, z)\, H(z, y).$$

Hence, there exists a signature function $S_K$ on $K$ [take $H(\cdot, z_0)$ for a fixed $z_0$ in
$K$] such that, for any $x, y \in K$,

$$S_K(x)\, S_K(y)\, G(x, y) = |G(x, y)|.$$

Denote by $S_n$ the function $S_{[-n, n]}$; then, repeating the above argument and letting
$n$ tend to $\infty$, we finally obtain the existence of $S$ satisfying

$$S(x)\, S(y)\, G(x, y) = |G(x, y)|.$$

If $S$ is not identically equal to 1, then there exists a point $x$ at which $S(x) = 1$
and $\liminf_{y \to x} S(y) = -1$. By continuity, this means that $G(x, x)$ has to be equal
to 0 [because $G(x, y_n) \le 0$ for a sequence $(y_n)_{n > 0}$ that converges to $x$]. This is
excluded since $G$ is positive definite. Consequently, $G$ is positive.

We define the measure $m$ on $\mathbb{R}$ by

$$m(dy) = \left(1 \wedge \frac{1}{\sqrt{G(y, y)}}\right) e^{-|y|}\, dy.$$


Note that the measure $m$ is finite and $\int \sqrt{G(y, y)}\, m(dy) < \infty$. Making use of the
covariance inequality ($G(x, y) \le \sqrt{G(x, x) G(y, y)}$ for $x, y$ in $\mathbb{R}$), and the dominated
convergence theorem, we see that the function

$$\chi(x) = \int G(x, y)\, m(dy) \tag{4}$$

is continuous.


We now consider the fixed compact set $K = [a, b]$ with $a < b$. For any integer $n$,
we keep the definition of $d_n$ introduced at the beginning of the proof. We set


$$I_n = \left\{k \in \mathbb{Z} : -n 2^n + 1 \le k \le n 2^n - 1 \text{ and } \frac{k}{2^n} \in K\right\}.$$


Define $G_n$ on the set $\{(\frac{k}{2^n}, \frac{\ell}{2^n}) : k, \ell \in I_n\}$ by

$$G_n\left(\frac{k}{2^n}, \frac{\ell}{2^n}\right) = G\left(\frac{k}{2^n}, \frac{\ell}{2^n}\right) \int_{\ell/2^n}^{(\ell+1)/2^n} m(dy).$$

Since the process $\eta^2$ is infinitely divisible and the positive measure $m$ has a support
equal to $\mathbb{R}$, $G_n^{-1}$ is an M-matrix ($G_n$ need not be symmetric). Hence, we can write

$$G_n^{-1} = c_n I - B_n,$$

where $B_n \ge 0$ and $c_n$ is strictly greater than the absolute value of any eigenvalue
of $B_n$.
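Define

$$\chi_n\left(\frac{k}{2^n}\right) = \sum_{\ell \in I_n} G_n\left(\frac{k}{2^n}, \frac{\ell}{2^n}\right)$$

and let $D_n$ be the diagonal matrix $\mathrm{diag}(\chi_n(\frac{k}{2^n}), k \in I_n)$. Then $D_n e_n = G_n e_n$,
where $e_n$ is equal to $(1, 1, \ldots, 1)$. Consequently, for any $k \in I_n$,

$$\sum_{\ell \in I_n} (D_n^{-1} G_n^{-1} D_n)(k, \ell) = \left(\chi_n\left(\frac{k}{2^n}\right)\right)^{-1} > 0.$$

Setting $T_n = \frac{1}{c_n} D_n^{-1} B_n D_n$, we have

$$D_n^{-1} G_n^{-1} D_n = c_n (I - T_n) \tag{5}$$

and, for every $k$, $\sum_{\ell \in I_n} T_n(k, \ell) < 1$. Consequently, the matrix $T_n$ is the transition
matrix of a transient Markov chain.
Rewriting (5), it follows that

$$G_n = D_n O_n D_n^{-1}, \tag{6}$$

with $O_n = \frac{1}{c_n} (I - T_n)^{-1}$.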

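Let $A$ be a square matrix of size $|I_n|$ defined by $A = (A(\frac{k}{2^n}, \frac{\ell}{2^n}))_{k, \ell \in I_n}$. We
associate with $A$ an operator on the set of the continuous functions with compact
support. We denote this operator $\mathcal{A}$ and define it as follows. Let $f$ be a continuous
function on $\mathbb{R}$ with a compact support; then the function $\mathcal{A} f$ is given by

$$\mathcal{A} f(x) = \sum_{\ell \in I_n} A\left(d_n(x), \frac{\ell}{2^n}\right) f\left(\frac{\ell}{2^n}\right) \qquad \forall x \in \mathbb{R},$$

with the convention that $A(\frac{k}{2^n}, \frac{\ell}{2^n}) = 0$ when $k$ is outside of $I_n$.
That way we associate with $D_n$ (resp. $D_n^{-1}$, $G_n$, $O_n$) the operator $\mathcal{D}_n$ (resp.
$\mathcal{D}_n^{-1}$, $\mathcal{G}_n$, $\mathcal{O}_n$). Note that we have, for every function $f$ and every $x$ in $\mathbb{R}$,

$$\mathcal{D}_n f(x) = \chi_n(d_n(x))\, f(d_n(x)),$$

$$(\mathcal{D}_n)^{-1} f(x) = (\chi_n(d_n(x)))^{-1} f(d_n(x)),$$

$$\mathcal{G}_n f(x) = \sum_{\ell \in I_n} G\left(d_n(x), \frac{\ell}{2^n}\right) f\left(\frac{\ell}{2^n}\right) \int_{\ell/2^n}^{(\ell+1)/2^n} m(dy).$$

By (5), we know that the sequence $(\mathcal{O}_n)_{n > 0}$ satisfies

$$\mathcal{O}_n = \mathcal{D}_n^{-1} \mathcal{G}_n \mathcal{D}_n.$$

Moreover, for each $n$, $\mathcal{O}_n$ is the Green operator of a finite state space Markov
process.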

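Let $\mathcal{O}$ be the operator defined on $C_K$ (the continuous functions with support
in $K$) by

$$\mathcal{O} f(x) = \mathcal{D}^{-1} \mathcal{G} \mathcal{D} f(x), \qquad x \in K,$$

where

$$\mathcal{D} f(x) = \chi_K(x) f(x), \qquad \mathcal{D}^{-1} f(x) = (\chi_K(x))^{-1} f(x)$$

and

$$\mathcal{G} f(x) = \int_K G(x, y) f(y)\, m(dy),$$

with

$$\chi_K(x) = \int_K G(x, y)\, m(dy).$$

Note that, by the continuity of $G$ and the fact that $G(x, x) > 0$ and $m$ has $\mathbb{R}$ as its
support, $\chi_K$ is continuous, strictly positive on $K$ and, thus, bounded below by a
strictly positive constant.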


LEMMA 4.2. For every function $f \in C_K$, the sequence $(\mathcal{O}_n f)_{n > 0}$ converges
to $\mathcal{O} f$ uniformly on $K$.


PROOF. For any function $f$ in $C_K$ and any $x \in K$, we have

$$\mathcal{O}_n f(x) = \frac{1}{\chi_n(d_n(x))} \int_K G(d_n(x), d_n(y))\, \chi_n(d_n(y))\, f(d_n(y))\, m(dy)$$

and

$$\mathcal{O} f(x) = \frac{1}{\chi_K(x)} \int_K G(x, y)\, \chi_K(y)\, f(y)\, m(dy).$$

Note that

$$\chi_n(d_n(x)) = \int_K G(d_n(x), d_n(y))\, m(dy).$$

Since $G$ is uniformly continuous on compacts and $m$ is finite, the sequence
$(\chi_n(d_n(x)))_{n \in \mathbb{N}}$ converges uniformly on $K$ to $\chi_K(x)$. Hence, the sequence
$G(d_n(x), d_n(y))\, \chi_n(d_n(y))\, f(d_n(y))$ converges uniformly on $K \times K$ to
$G(x, y)\, \chi_K(y)\, f(y)$. □


We would like to show that $\mathcal{O}$ is a Green operator. For this, we shall use the
following lemma which, in essence, is due to Hunt (see [12], Chapter X, page 255,
or [15]).


LEMMA C. Let $V$ be a kernel on a measurable space $(E, \mathcal{E})$ such that $V$
satisfies the complete maximum principle and the function $V1$ is bounded. There
exists then a sub-Markovian resolvent $(V_p)$ such that $V_0 = V$.


As Meyer has noted in [12], page 253, if we assume that $V$ is continuous, then
it is sufficient to verify the complete maximum principle for continuous functions
with compact support only. More precisely, we have to verify that, for any $a > 0$ and
any two positive continuous functions with compact support, $(f, h)$, if for all $x$
in $\{h > 0\}$

$$a + V f(x) \ge V h(x),$$

then the inequality remains valid for all $x$ in $E$.


LEMMA 4.3. The kernel $\mathcal{O}$ satisfies the complete maximum principle on $K$.


PROOF. First note that $\mathcal{O}$ maps continuous functions with support in $K$ to
continuous functions and is therefore a continuous kernel on $K$. Let $f$ and $h$ be in
$C_K^+$ and $a > 0$ and suppose that

$$a + \mathcal{O} f(x) \ge \mathcal{O} h(x) \quad \text{for all } x \in \{h > 0\}. \tag{7}$$

Recall that $\{h > 0\}$ is contained in $K$. We need to show that (7) is satisfied by
all $x \in K$. Suppose that this is not true. Then there exist a constant $b > 0$ and a
point $x_0$ in $K$ such that

$$a + \mathcal{O} f(x_0) < \mathcal{O} h(x_0) - b.$$

Let $\varepsilon > 0$ and $N$ be such that, for any $n \ge N$,

$$|\mathcal{O}_n f(x) - \mathcal{O} f(x)| < \varepsilon \quad \forall x \in K$$

and

$$|\mathcal{O}_n h(x) - \mathcal{O} h(x)| < \varepsilon \quad \forall x \in K.$$

Such $N$ exists by Lemma 4.2 and the fact that $f, h \in C_K$.
By (7), for every $n \ge N$ and $x$ in $\{h > 0\}$,

$$a + 2\varepsilon + \mathcal{O}_n f(x) \ge \mathcal{O}_n h(x). \tag{8}$$

Recall now that, for each $n$, $O_n$ is a potential of a Markov process on $I_n$. It
follows from [16] that $O_n$ satisfies the complete maximum principle on $I_n$ and, by
its definition, $\mathcal{O}_n$ satisfies the complete maximum principle on $K$. Consequently,
(8) is valid for all $x$ in $K$. In particular, we have

$$a + 2\varepsilon + \mathcal{O}_n f(x_0) \ge \mathcal{O}_n h(x_0).$$

On the other hand, we have

$$a - 2\varepsilon + \mathcal{O}_n f(x_0) < \mathcal{O}_n h(x_0) - b.$$

Choosing $\varepsilon < b/4$ leads to the desired contradiction. □


In order to simplify the rest of the proof of Theorem 3.4, we now use, instead
of $\mathcal{O}$, the operator $V_K$ defined by

$$V_K f(x) = \frac{1}{\chi_K(x)} \int_K G(x, y) f(y)\, m(dy).$$

First we note that $V_K 1 = 1$. Further, since $V_K f = \mathcal{O}(f/\chi_K)$, $V_K$ satisfies the
complete maximum principle on $K$, by Lemma 4.3.

Let $K_n$ be the compact set $[-n, n]$ for $n \in \mathbb{N}$, and denote by $V_{K_n}$ the corresponding
operator as defined above for $K$. Further, arguing as for $\chi_K$, one can show that
$\chi$ defined in (4) is strictly positive.


LEMMA 4.4. Let $V$ be the operator defined on bounded Borel functions by

$$V f(x) = \frac{1}{\chi(x)} \int_{\mathbb{R}} G(x, y) f(y)\, m(dy). \tag{9}$$

Then there exists a sub-Markovian resolvent $(V_p)$ such that $V_0 = V$.


PROOF. Note that $V1 = 1$, and that $V$ maps continuous functions with compact
support to continuous functions on $\mathbb{R}$. Let $f$ and $h$ be two positive continuous
functions with compact supports and $a > 0$ and suppose that

$$a + V f(x) \ge V h(x) \quad \text{for all } x \in \{h > 0\}; \tag{10}$$

we need to show that this is satisfied for all $x$ in $\mathbb{R}$.


We denote by $H$ a compact set that contains both the compact supports of $f$
and of $h$. There exists $n_0$ such that, for $n \ge n_0$, $H$ is contained in $K_n$. Hence, (10)
can be written as

$$a\, \frac{\chi(x)}{\chi_{K_n}(x)} + V_{K_n} f(x) \ge V_{K_n} h(x) \quad \text{for all } x \in \{h > 0\}.$$

We have shown, when defining $m$, that $\chi$ is a continuous function. Moreover, for
any $n \in \mathbb{N}$, the function $\chi_{K_n}$ is continuous and the sequence $(\chi_{K_n})$ is increasing
and converges pointwise to $\chi$. Consequently, by Dini's theorem, $(\chi_{K_n})$ converges
uniformly on compact sets to $\chi$ and $\chi(x)$ is strictly positive. Since the sequence $(\chi_{K_n})_{n \ge n_0}$ is bounded
below by a strictly positive constant on $H$, $\chi/\chi_{K_n}$ converges to 1 uniformly on $H$.
Hence, for every $\varepsilon > 0$, there exists $N$ such that, for every $n \ge N$,

$$a(1 + \varepsilon) + V_{K_n} f(x) \ge V_{K_n} h(x) \quad \text{for all } x \in \{h > 0\}.$$


Since the operator $V_{K_n}$ satisfies the complete maximum principle on $K_n$, the above
inequality is still true for $x$ in $K_n$. We now multiply each side of the inequality by
$\chi_{K_n}(x)/\chi(x)$ to obtain

$$a(1 + \varepsilon)\, \frac{\chi_{K_n}(x)}{\chi(x)} + V f(x) \ge V h(x) \quad \text{for all } x \in K_n.$$

Since $\chi_{K_n}(x)/\chi(x) \le 1$, we finally get

$$a(1 + \varepsilon) + V f(x) \ge V h(x) \quad \text{for all } x \in K_n,$$

and letting $n$ tend to $\infty$,

$$a(1 + \varepsilon) + V f(x) \ge V h(x) \quad \text{for all } x \in \mathbb{R}.$$

Since this is true for any $\varepsilon > 0$, (10) is established for all $x$ in $\mathbb{R}$. □

The sub-Markovian resolvent $(V_p)$ allows one to construct a semigroup $(P_t)$
with $(V_p)$ as its resolvent. It will be a Ray process if we restrict our state space to a
compact set, and, in general, we may need to apply a Ray-Knight compactification
in order to obtain a good semigroup. With this semigroup one can construct a
transient Markov process on $E = \mathbb{R}$ with 0-potential equal to $V$. On the other hand,
(9) implies that $m$ is a reference measure for the Markov process with potential $V$
and that the potential density $h(x, y)$ of $V$ with respect to $m$ is equal to

$$h(x, y) = \frac{1}{\chi(x)}\, G(x, y).$$

Setting then $\mu(dy) = \chi(y)\, m(dy)$, we see that the operator $V$ admits the symmetric
densities $\left(\frac{G(x, y)}{\chi(x) \chi(y)},\ x, y \in \mathbb{R}\right)$ with respect to $\mu$. □


REMARK 4.5. Assume that $G$ is such that, for any $p$, the inverse of the covariance
matrix of $(\eta_{x_1}, \eta_{x_2}, \ldots, \eta_{x_p})$ is a diagonally dominant M-matrix. One can
then follow the steps of the proof of Theorem 3.4, without the need to define the
matrices $D_n$ (we check similarly, as in Remark 4.1, that the matrix $T_n = \frac{1}{c_n} B_n$ is
a transition matrix) and conclude that there exists a symmetric Markov process
with potential densities equal to $(G(x, y), (x, y) \in \mathbb{R}^2)$. This, together with
Theorem 3.6(i), leads to Theorem 3.6(ii).

Finally, every centered Gaussian process with infinitely divisible square is equal 
to a deterministic function times a Gaussian process that is associated with a 
Markov process. The isomorphism theorem of Dynkin can hence be easily adapted 
to incorporate the deterministic functions, so that it can be used in studying any 
Gaussian process with infinitely divisible square.


5. The case of the Brownian sheet. A Brownian sheet is a centered Gaussian
process $(W_{x,s}, x \ge 0, s \ge 0)$ with a covariance given by

$$\mathbb{E}(W_{x,s} W_{y,t}) = (x \wedge y)(s \wedge t).$$

Remember that, up to a multiplicative constant, $(x \wedge y;\ x, y \in \mathbb{R}_+)$ is the Green
function of the linear Brownian motion killed at its first hitting time of 0. Hence,
this covariance is the product of two Green functions and one may ask whether
$(W^2_{x,s}, x \ge 0, s \ge 0)$ is infinitely divisible. The answer is given in the following
proposition.


PROPOSITION 5.1. (i) For every $(x_i, s_i)_{1 \le i \le 3}$ of $\mathbb{R}_+^2$, the vector
$(W^2_{x_i, s_i}, 1 \le i \le 3)$ is infinitely divisible.

(ii) The process $(W^2_{x,s}, x \ge 0, s \ge 0)$ is not infinitely divisible.

PROOF. Using Griffiths' criterion (or Bapat's criterion), it is easy to check
that a sufficient condition for the infinite divisibility of the square of a three-dimensional
Gaussian vector, indexed by $\{1, 2, 3\}$ and with a covariance $g$, is

$$g(i, j)\, g(k, k) \ge g(i, k)\, g(j, k), \tag{11}$$

for any choice of $i, j, k$ in $\{1, 2, 3\}$.

Since any Green function satisfies (11), so does the covariance function of the
Brownian sheet, as a product of two Green functions.

By Bapat's criterion, we know that a covariance matrix $G$, such that $G > 0$,
corresponds to an infinitely divisible square Gaussian vector if, and only if, $G^{-1}$ is an
M-matrix. We choose $(x_i, s_i)_{1 \le i \le 4}$ such that $0 < x_1 < x_3 < x_2 < x_4$ and
$0 < s_4 < s_1 < s_3 < s_2$. Let $G$ be the matrix $((x_i \wedge x_j)(s_i \wedge s_j))_{1 \le i, j \le 4}$. We compute
the coefficient $G^{-1}_{1,2}$:

$$\det(G)\, G^{-1}_{1,2} = x_1 x_3 s_4 (x_2 - x_3)(s_3 - s_1) s_4 > 0.$$

Hence, $G^{-1}$ is not an M-matrix and the corresponding Gaussian vector does not
have an infinitely divisible square. □
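
Both parts of the proposition can be illustrated numerically. The sketch below rests on assumptions made only for this example: the four points $(x_i, s_i)$ are one arbitrary choice satisfying the ordering required in the proof, and the helper names `sheet_covariance` and `satisfies_condition_11` are hypothetical. It checks condition (11) for every triple of indices and compares $\det(G)\, G^{-1}_{1,2}$ with the product $x_1 x_3 s_4 (x_2 - x_3)(s_3 - s_1) s_4$.

```python
import itertools
import numpy as np

def sheet_covariance(x, s):
    """Covariance ((x_i ^ x_j)(s_i ^ s_j)) of the Brownian sheet at (x_i, s_i)."""
    x = np.asarray(x, dtype=float)
    s = np.asarray(s, dtype=float)
    return np.minimum.outer(x, x) * np.minimum.outer(s, s)

def satisfies_condition_11(g, tol=1e-12):
    """Check g(i,j) g(k,k) >= g(i,k) g(j,k) for all index triples."""
    n = g.shape[0]
    return all(
        g[i, j] * g[k, k] >= g[i, k] * g[j, k] - tol
        for i, j, k in itertools.product(range(n), repeat=3)
    )

# Illustrative points with 0 < x1 < x3 < x2 < x4 and 0 < s4 < s1 < s3 < s2.
x = [1.0, 3.0, 2.0, 4.0]
s = [2.0, 4.0, 3.0, 1.0]
G = sheet_covariance(x, s)
G_inv = np.linalg.inv(G)

condition_11_holds = satisfies_condition_11(G)

# Cofactor of the (1, 2) entry, recovered as det(G) * (G^{-1})_{1,2}.
cofactor_12 = np.linalg.det(G) * G_inv[0, 1]
closed_form = x[0] * x[2] * s[3] * (x[1] - x[2]) * (s[2] - s[0]) * s[3]
```

For these points `closed_form` equals 2 and matches `cofactor_12` up to rounding, so `G_inv[0, 1]` is strictly positive and $G^{-1}$ is not an M-matrix, even though condition (11) holds for every triple.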


Acknowledgments. We would like to thank Gennady Samorodnitsky and Jay 
Rosen for helpful discussions and comments which helped us to improve a previ- 
ous version of this work. 


REFERENCES 


[1] BAPAT, R. B. (1989). Infinite divisibility of multivariate gamma distribution and M matrices.
Sankhyā Ser. A 51 73-78. MR1065560
[2] BASS, R., EISENBAUM, N. and SHI, Z. (2000). The most visited sites of symmetric stable
processes. Probab. Theory Related Fields 116 391-404. MR1749281
[3] BERMAN, A. and PLEMMONS, R. J. (1979). Nonnegative Matrices in the Mathematical Sciences.
Academic Press, New York. MR0544666
[4] ÇINLAR, E. (1975). Introduction to Stochastic Processes. Prentice-Hall, Englewood Cliffs, NJ.
MR0380912
[5] DYNKIN, E. B. (1983). Local times and quantum fields. In Seminar on Stochastic Processes
82 69-84. Birkhäuser, Boston. MR0902412
[6] EVANS, S. N. (1991). Association and infinite divisibility for the Wishart distribution and its
diagonal marginals. J. Multivariate Anal. 36 199-203. MR1096666
[7] EISENBAUM, N. (2003). On the infinite divisibility of squared Gaussian processes. Probab.
Theory Related Fields 125 381-392. MR1964459
[8] GRIFFITHS, R. C. (1970). Infinite divisible multivariate gamma distributions. Sankhyā Ser. A
32 393-404. MR0295406
[9] GRIFFITHS, R. C. (1984). Characterization of infinitely divisible multivariate gamma distributions.
J. Multivariate Anal. 15 12-20. MR0755813
[10] LÉVY, P. (1948). The arithmetical character of the Wishart distribution. Proc. Cambridge Philos.
Soc. 44 295-297. MR0026261
[11] MARCUS, M. B. and ROSEN, J. (1992). Sample path properties of the local times of strongly
symmetric Markov processes via Gaussian processes. Ann. Probab. 20 1603-1684.
MR1188037
[12] MEYER, P. A. (1966). Probabilités et Potentiel. Hermann, Paris. MR0205287
[13] MORAN, P. A. P. and VERE-JONES, D. (1969). The infinite divisibility of multi-gamma distributions.
Sankhyā Ser. A 31 191-194.
[14] PARANJAPE, S. R. (1978). Simpler proofs for the infinite divisibility of multivariate gamma
distributions. Sankhyā Ser. A 40 393-398. MR0589292
[15] TAYLOR, J. C. (1972). On the existence of sub-Markovian resolvents. Invent. Math. 17 85-93.
MR0345222
[16] TAYLOR, J. C. (1975). A characterization of the kernel $\lim_{\lambda \downarrow 0} V_\lambda$ for sub-Markovian
resolvents $(V_\lambda)$. Ann. Probab. 3 355-357. MR0373025
[17] VERE-JONES, D. (1967). The infinite divisibility of a bivariate gamma distribution. Sankhyā
Ser. A 29 421-422. MR0226704


LABORATOIRE DE PROBABILITÉS
UNIVERSITÉ PARIS VI-CNRS
4 PLACE JUSSIEU
75252 PARIS CEDEX 05
FRANCE
E-MAIL: nae@ccr.jussieu.fr

INDUSTRIAL ENGINEERING AND MANAGEMENT
TECHNION
HAIFA
ISRAEL 32000
E-MAIL: iehaya@tx.technion.ac.il


The Annals of Probability 

2006, Vol. 34, No. 2, 743-778 

DOI: 10.1214/009117905000000648

© Institute of Mathematical Statistics, 2006 


THE SHANNON INFORMATION OF FILTRATIONS AND THE 
ADDITIONAL LOGARITHMIC UTILITY OF INSIDERS 


By STEFAN ANKIRCHNER, STEFFEN DEREICH AND PETER IMKELLER 


Humboldt-Universität zu Berlin, Technische Universität Berlin
and Humboldt-Universität zu Berlin


The background for the general mathematical link between utility and in- 
formation theory investigated in this paper is a simple financial market model 
with two kinds of small traders: less informed traders and insiders, whose 
extra information is represented by an enlargement of the other agents' fil- 
tration. The expected logarithmic utility increment, that is, the difference of 
the insider's and the less informed trader's expected logarithmic utility is de- 
scribed in terms of the information drift, that is, the drift one has to eliminate 
in order to perceive the price dynamics as a martingale from the insider's 
perspective. On the one hand, we describe the information drift in a very gen- 
eral setting by natural quantities expressing the probabilistic better informed 
view of the world. This, on the other hand, allows us to identify the addi- 
tional utility by entropy related quantities known from information theory. 
In particular, in a complete market in which the insider has some fixed ad- 
ditional information during the entire trading interval, its utility increment 
can be represented by the Shannon information of his extra knowledge. For 
general markets, and in some particular examples, we provide estimates of 
maximal utility by information inequalities. 


Received July 2004; revised February 2005.

AMS 2000 subject classifications. Primary 60H30, 94A17; secondary 91B16, 60G44.

Key words and phrases. Enlargement of filtration, logarithmic utility, utility maximization,
heterogeneous information, insider model, Shannon information, information difference, entropy,
differential entropy.


0. Introduction. A simple mathematical model of two agents on a financial
market taking their portfolio decisions on the basis of different information horizons
has attracted much attention in recent years. Both agents are small, and unable
to influence the price dynamics of the risky assets constituting the market. One
agent just acts on the basis of the evolution of the market, the other one, the insider,
possesses some additional knowledge at every instant of the continuous trading interval.
This basic fact is modeled by associating two different filtrations with each
agent, from which they make their portfolio decisions: the less informed agent, at
time $t$, just has the $\sigma$-field $\mathcal{F}_t$, corresponding to the natural evolution of the market
up to this time, at his disposal for deciding about future investments, while the insider
is able to make better decisions, taking his knowledge from a bigger $\sigma$-field
$\mathcal{G}_t \supset \mathcal{F}_t$. We give a short selection from among the many papers dealing with this
model, just indicating the most important mathematical techniques used for its investigation.
Methods are focused on martingale and stochastic control theory, and
techniques of enlargement of filtrations (see [21]), starting with the conceptual pa- 
per by Duffie and Huang [10], mostly in the initial enlargement setting, that is, the 
insider gets some fixed extra information at the beginning of the trading interval. 
The model is successively studied on stochastic bases with increasing complexity: 
that is, Karatzas and Pikovsky [23] on Wiener space, Grorud and Pontier [14] allow 
Poissonian noise and Biagini and Øksendal [6] employ anticipative calculus techniques.
In the same setting, Amendinger, Becherer and Schweizer [1] calculate the
value of insider information from the perspective of specific utilities. Baudoin [5] 
introduces the concept of weak additional information consisting in the knowledge 
of the law of some random element. Campi [7] considers hedging techniques for 
insiders in the incomplete market setting. It is clear that the expected utility the 
insider is able to gain from final wealth in this simple model will be bigger than 
the uninformed traders' utility, for every utility function. And, in fact, many of 
the quoted papers deal with the calculation of a better informed agent's additional 
utility.

In [3], in the setting of initial enlargements and logarithmic utility, a crucial 
and natural link between the additional expected logarithmic utility and informa- 
tion theoretic concepts was made. The insider's logarithmic utility advantage is 
identified with the Shannon entropy of the additional information. In the same set- 
ting, Gasbarra and Valkeila [15] extended this link by interpreting the logarithmic 
utility increment by the Kullback-Leibler information of the insider's additional
knowledge from the perspective of Bayesian modeling. In the environment of this
utility-information paradigm, the papers [2, 8, 17-20] describe additional utility
and treat arbitrage questions and their interpretation in information theoretic terms
in increasingly complex models of the same base structure, including some simple
examples of progressive enlargements. It is clear that utility concepts different
from the logarithmic one correspond on the information theoretic side to the generalized
entropy concepts of $f$-divergences.

In this paper we shall continue the investigation of mathematical questions re- 
lated to the link between utility and information theory in the most general setting 
of enlargements of filtrations: besides assuming eventually that the base space be 
standard, to ensure the existence of regular conditional probabilities, we shall let 
the filtration of the better informed agent just contain the one of the natural evolu- 
tion of knowledge. To concentrate on one kind of entropy in this general setting, we 
shall consider logarithmic utility throughout. In this framework, Ankirchner and 
Imkeller [2] calculate the maximal expected utility of an agent from the intrinsic 
point of view of his (general) filtration, and relate the finiteness of expected utility 
via the (NFLVR) condition to the characterization of semimartingales by the the- 
orem of Dellacherie-Meyer-Mokobodski. The compensator in the Doob-Meyer 
decomposition of underlying asset price processes with respect to the agent's filtration
is determined by the information drift process. In this paper we shall give a
general analysis of the nature of this process, and relate it to measuring the difference
of the information residing in the two filtrations, independently of the particular
price dynamics. The basic observation we start with in Section 2 identifies the
information drift process with Radon-Nikodym densities of the stochastic kernel
in an integral representation of the conditional probability process and the conditional
probability process itself. This observation allows for an identification of
the additional utility by the information difference of the two filtrations in terms
of Shannon entropy notions in Section 5, again independent of particular price
dynamics of the financial market.

The paper is organized as follows. In the preparatory Section 1 we recall the 
main results about the connection between finite utility filtrations, properties of 
the price dynamics from the perspective of different agents and properties of the 
information drift from [2]. In Section 2 (Theorems 2.6 and 2.10) properties of the 
conditional probability processes with respect to the agents' filtrations and the in- 
formation drift process are investigated in depth, and lead to the identification of 
the information drift by subjective conditional probability quantities. The descrip- 
tion of the additional utility in terms of entropy notions is more easily obtained, 
if the additional information in the bigger filtration comes in discrete bits along a 
sequence of partitions of the trading interval, leading to stepwise "initial enlarge- 
ments" which ultimately converge to the big filtration as the mesh of the partitions 
shrinks to 0. This is done in Section 5 (Theorem 5.8), after being prepared in 
Sections 3 and 4 by a general investigation of the convergence properties of information
drifts going along with the convergence of such discretized enlargements
to the big filtration. In the final Section 6 general facts known from Shannon infor- 
mation theory (see [16]) are applied to estimate the expected maximal logarithmic 
utility of a better informed agent via the identification theorem of Section 5, in sev- 
eral particular cases. Entropy maximizing properties of Gaussian random variables 
play an important role. 


1. Preliminaries. In this preparatory section we define the financial market 
model and recall some basic facts about expected utility maximization. Our fa- 
vorite utility function will be the logarithmic one, for which we will then compare 
the maximal expected utilities of agents on the market who act on the background
of asymmetric information. Recalling a result from [2], we will describe the util- 
ity increment of a better informed agent by the respective information drift of the 
agents' filtrations. 

Let (82, F, P) be a probability space with a filtration (F;)o<:<7, where T > 0 
is a fixed time horizon. We consider a financial market with one nonrisky asset of 
interest rate normalized to 0, and one risky asset with price 5; at time t € [0, T]. 
We assume that S is a continuous (¥;)-semimartingale with values in R and write 
“A for the set of all S-integrable and (¥;)-predictable processes such that 09 = 0. 
If 0 € A, then we denote by (0 - S) the usual stochastic integral process. For all 


746 S. ANKIRCHNER, S. DEREICH AND P. IMKELLER ` 


$x > 0$, we interpret

\[ x + (\theta \cdot S)_t, \qquad 0 \le t \le T, \]


as the wealth process of a trader possessing an initial wealth x and choosing the 
investment strategy $\theta$ on the basis of his knowledge horizon corresponding to the
filtration $(\mathcal{F}_t)$. Throughout this paper we will suppose the preferences of the agents
to be described by the logarithmic utility function. Furthermore, we suppose that
the traders' total wealth has always to be strictly positive, that is, for all $t \in [0, T]$,

\[ x + (\theta \cdot S)_t > 0 \quad \text{a.s.} \tag{1} \]


Strategies $\theta$ satisfying equation (1) will be called $x$-superadmissible. The agents
want to maximize their expected logarithmic utility from their wealth at time $T$.
So we are interested in the exact value of

\[ u(x) = \sup\{ E \log(x + (\theta \cdot S)_T) : \theta \in \mathcal{A} \ x\text{-superadmissible} \}. \]

Sometimes we will write $u_{\mathcal{F}}(x)$ in order to stress the underlying filtration. The
expected logarithmic utility of the agent can be calculated easily, if one has a
semimartingale decomposition of the form

\[ S_t = M_t + \int_0^t \eta_s \, d\langle M, M\rangle_s, \tag{2} \]

where $\eta$ is a predictable process. Such a decomposition is given for a large class
of semimartingales. For example, if S satisfies the property (NFLVR), then it may 
be decomposed as in (2) (see [12]). As is shown in a forthcoming Ph.D. thesis [4], 
finiteness of u(x) implies already such a decomposition to exist. Hence, a decom- 
position as in (2) may be given even in cases where arbitrage exists. We state 
Theorem 2.9 of [2]. 


PROPOSITION 1.1. Suppose $S$ can be decomposed into $S = M + \eta \cdot \langle M, M\rangle$.
Then for any $x > 0$, the following equation holds:

\[ u(x) = \log x + \tfrac{1}{2} E \int_0^T \eta_s^2 \, d\langle M, M\rangle_s. \tag{3} \]
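As a sanity check of (3), one may look at a toy model that is not taken from the text: we assume $M = \sigma W$ for a Brownian motion $W$ and a constant $\eta$, so that $d\langle M, M\rangle_t = \sigma^2\,dt$. Restricting to strategies that hold a constant multiple $p$ of current wealth (again an assumption made only for illustration), the expected log-utility is available in closed form, and a grid search over $p$ should recover the value in (3) at $p = \eta$. The names `expected_log_utility`, `formula_3` and all parameter values are hypothetical.

```python
import math

def expected_log_utility(x, p, eta, sigma, T):
    """E log(V_T) for the strategy theta_t = p * V_t (V = wealth process).

    Assumed model (not from the text): M = sigma * W and eta constant, so
    dS = sigma dW + eta * sigma**2 dt. Then
    V_T = x * exp(p*sigma*W_T + (p*eta - p**2/2) * sigma**2 * T).
    """
    return math.log(x) + (p * eta - 0.5 * p * p) * sigma ** 2 * T

def formula_3(x, eta, sigma, T):
    """Right-hand side of (3) when eta is constant and d<M,M>_t = sigma**2 dt."""
    return math.log(x) + 0.5 * eta ** 2 * sigma ** 2 * T

x, eta, sigma, T = 2.0, 0.4, 0.3, 1.0

# Grid search over the proportion p in [-2, 2] with step 0.001.
grid = [i / 1000 for i in range(-2000, 2001)]
best_p = max(grid, key=lambda p: expected_log_utility(x, p, eta, sigma, T))
best_value = expected_log_utility(x, best_p, eta, sigma, T)

print(best_p, best_value, formula_3(x, eta, sigma, T))
```

The search only covers proportional strategies, so it illustrates rather than proves the supremum in (3); in this toy model the maximizer coincides with $\eta$.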
This proposition motivates the following definition. 


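DEFINITION 1.2. A filtration $(\mathcal{G}_t)$ is called a finite utility filtration for $S$, if $S$
is a $(\mathcal{G}_t)$-semimartingale with decomposition $dS = dM + \zeta \cdot d\langle M, M\rangle$, where $\zeta$ is
$(\mathcal{G}_t)$-predictable and belongs to $L^2(M)$, that is, $E \int_0^T \zeta^2 \, d\langle M, M\rangle < \infty$. We write

\[ \mathbb{F} = \{ (\mathcal{H}_t) \supset (\mathcal{F}_t) \mid (\mathcal{H}_t) \text{ is a finite utility filtration for } S \}. \]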


SHANNON INFORMATION AND ADDITIONAL UTILITY 747 


We now compare two traders who take their portfolio decisions not on the basis 
of the same filtration, but on the basis of different information flows represented 
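by the filtrations $(\mathcal{G}_t)$ and $(\mathcal{H}_t)$, respectively. Suppose that both filtrations $(\mathcal{G}_t)$ and
$(\mathcal{H}_t)$ are finite utility filtrations. We denote by

\[ S = M + \zeta \cdot \langle M, M\rangle \tag{4} \]

the semimartingale decomposition with respect to $(\mathcal{G}_t)$ and by

\[ S = N + \beta \cdot \langle N, N\rangle \tag{5} \]

the decomposition with respect to $(\mathcal{H}_t)$. Obviously,

\[ \langle M, M\rangle = \langle S, S\rangle = \langle N, N\rangle \]

and, therefore, the utility difference is equal to

\[ u_{\mathcal{H}}(x) - u_{\mathcal{G}}(x) = \tfrac{1}{2} E \int_0^T (\beta^2 - \zeta^2) \, d\langle M, M\rangle. \]

Furthermore, (4) and (5) imply

\[ M = N - (\zeta - \beta) \cdot \langle M, M\rangle \quad \text{a.s.} \tag{6} \]

If $\mathcal{G}_t \subset \mathcal{H}_t$ for all $t \ge 0$, equation (6) can be interpreted as the semimartingale
decomposition of $M$ with respect to $(\mathcal{H}_t)$. In this case one can show that the utility
difference depends only on the process $\mu = \beta - \zeta$. We therefore use the following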
notion. 


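DEFINITION 1.3. Let $(\mathcal{G}_t)$ be a finite utility filtration and $S = M + \zeta \cdot \langle M, M\rangle$
the Doob-Meyer decomposition of $S$ with respect to $(\mathcal{G}_t)$. Suppose that $(\mathcal{H}_t)$ is
a filtration such that $\mathcal{G}_t \subset \mathcal{H}_t$ for all $t \in [0, T]$. The $(\mathcal{H}_t)$-adapted measurable
process $\mu$ satisfying

\[ M - \int_0^\cdot \mu_s \, d\langle M, M\rangle_s \quad \text{is a } (\mathcal{H}_t)\text{-local martingale} \]

is called information drift (see [19]) of $(\mathcal{H}_t)$ with respect to $(\mathcal{G}_t)$.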


The following proposition relates the information drift to the expected logarith- 
mic utility increment. 


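PROPOSITION 1.4. Let $(\mathcal{G}_t)$ and $(\mathcal{H}_t)$ be two finite utility filtrations such that
$\mathcal{G}_t \subset \mathcal{H}_t$ for all $t \in [0, T]$. If $\mu$ is the information drift of $(\mathcal{H}_t)$ w.r.t. $(\mathcal{G}_t)$, then we
have

\[ u_{\mathcal{H}}(x) - u_{\mathcal{G}}(x) = \tfrac{1}{2} E \int_0^T \mu^2 \, d\langle M, M\rangle. \]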




PROOF. See Theorem 2.13 in [2]. □


So far we only required the information drift to be measurable and adapted. Due 
to the continuity of S, we have the following. 


PROPOSITION 1.5. The information drift, provided it exists, may be chosen to 
be predictable. 


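PROOF. Suppose $\mu$ is a measurable and $(\mathcal{G}_t)$-adapted process such that

\[ M - \int_0^\cdot \mu_t \, d\langle M, M\rangle_t \]

is a $(\mathcal{G}_t)$-local martingale. We denote by ${}^p\mu$ the predictable projection of $\mu$ with
respect to $(\mathcal{G}_t)$. We will show that $M - {}^p\mu \cdot \langle M, M\rangle$ remains a $(\mathcal{G}_t)$-local martingale.

Let $\tau$ be a stopping time localizing $M$ such that $M^\tau$, the martingale $M$ stopped
at $\tau$, is bounded. To simplify notation, we assume $M^\tau = M$. Let $0 \le s < t$, $A \in \mathcal{G}_s$
and $\varepsilon > 0$. Then

\[
\begin{aligned}
E\big(1_A (M_t - M_{s+\varepsilon})\big)
&= E\Big(1_A \int_{s+\varepsilon}^t \mu_r \, d\langle M, M\rangle_r\Big) \\
&= E\Big(1_A \, E\Big[\int_{s+\varepsilon}^t \mu_r \, d\langle M, M\rangle_r \,\Big|\, \mathcal{G}_s\Big]\Big) \\
&= E\Big(1_A \, E\Big[\int_{s+\varepsilon}^t {}^p\mu_r \, d\langle M, M\rangle_r \,\Big|\, \mathcal{G}_s\Big]\Big) \\
&= E\Big(1_A \int_{s+\varepsilon}^t {}^p\mu_r \, d\langle M, M\rangle_r\Big)
\end{aligned}
\]

(see Theorem 57, Chapter VI in [11]). By dominated convergence, the left-hand
side of this equation converges to $E(1_A (M_t - M_s))$ as $\varepsilon \downarrow 0$. The right-hand side
converges by similar arguments. Hence, we obtain

\[ E\big(1_A (M_t - M_s)\big) = E\Big(1_A \int_s^t {}^p\mu_r \, d\langle M, M\rangle_r\Big), \]

which means that $M - {}^p\mu \cdot \langle M, M\rangle$ is a $(\mathcal{G}_t)$-martingale. □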
We close this section by recalling some basic properties of information drifts. 


LEMMA 1.6. Suppose the filtration $(\mathcal{F}_t)$ is a finite utility filtration with respect
to which the Doob-Meyer decomposition of $S$ is given by $S = M + \eta \cdot \langle M, M\rangle$. Let
$(\mathcal{H}_t)$ be a filtration satisfying $\mathcal{F}_t \subset \mathcal{H}_t$ for all $t \in [0, T]$ and suppose that $(\mathcal{H}_t)$
has an information drift $\mu$ with respect to $(\mathcal{F}_t)$. Then the following properties hold
true:




(i) If $\mu$ belongs to $L^2(M)$, then the maximal expected utility $u_{\mathcal{H}}(x)$ is finite
for all $x > 0$.

(ii) The set of finite utility filtrations $\mathbb{F}$ is equal to the set of all filtrations
containing $(\mathcal{F}_t)$ and possessing an information drift $\lambda$ with respect to $(\mathcal{F}_t)$ such that
$\lambda \in L^2(M)$.

(iii) If $(\mathcal{H}_t)$ is a finite utility filtration, then $\mu$ is orthogonal to $L^2_{\mathcal{F}}(M)$, the
subspace of $(\mathcal{F}_t)$-predictable processes in $L^2(M)$.

(iv) If $(\mathcal{G}_t)$ is a filtration such that $\mathcal{F}_t \subset \mathcal{G}_t \subset \mathcal{H}_t$ for all $t \in [0, T]$, then there is
also an information drift $\kappa$ of $(\mathcal{G}_t)$ with respect to $(\mathcal{F}_t)$. More precisely, $\kappa$ is equal
to the $L^2(M)$-projection of $\mu$ onto the subspace of the $(\mathcal{G}_t)$-predictable processes.


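PROOF. Properties (i) and (ii) are obvious. For property (iii), let $S = N +
\beta \cdot \langle N, N\rangle$ denote the Doob-Meyer decomposition of $S$ relative to $(\mathcal{H}_t)$, and let
$\theta \in L^2_{\mathcal{F}}(M)$. Since $\theta$ is adapted to both $(\mathcal{F}_t)$ and $(\mathcal{H}_t)$, the integrals $(\theta \cdot M)$ and
$(\theta \cdot N)$ are square integrable martingales with expectation zero. Therefore,

\[
\begin{aligned}
E \int_0^T \theta \mu \, d\langle M, M\rangle
&= E\Big[ \int_0^T \theta \beta \, d\langle M, M\rangle - \int_0^T \theta \eta \, d\langle M, M\rangle \Big] \\
&= E\Big[ \int_0^T \theta \, dM - \int_0^T \theta \, dN \Big] \\
&= 0.
\end{aligned}
\]

Thus, $\mu$ is orthogonal to $L^2_{\mathcal{F}}(M)$. For property (iv), we refer again to [2]. □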


2. General enlargements. Assume again that the price process $S$ is a
semimartingale of the form

\[ S = M + \eta \cdot \langle M, M\rangle, \]

with respect to a finite utility filtration $(\mathcal{F}_t)$. Moreover, let $(\mathcal{G}_t)$ be a filtration such
that $\mathcal{F}_t \subset \mathcal{G}_t$, and let $\alpha$ be the information drift of $(\mathcal{G}_t)$ relative to $(\mathcal{F}_t)$. And, for
simplicity of notation, suppose in this section that time horizon is infinite, that is,
$T = \infty$. We shall aim at describing the relative information drift $\alpha$ by basic quantities
related to the conditional probabilities of the larger $\sigma$-algebras $\mathcal{G}_t$ with respect
to the smaller ones $\mathcal{F}_t$, $t \ge 0$. Roughly, modulo some tedious technical details to be
specified below, the relationship is as follows. Suppose for all $t \ge 0$ there is a regular
conditional probability $P_t(\cdot, \cdot)$ of $\mathcal{F}$ given $\mathcal{F}_t$, which can be decomposed into
a martingale component orthogonal to $M$, plus a component possessing a stochastic
integral representation with respect to $M$ with a kernel function $k_t(\cdot, \cdot)$. Then
we shall see that, provided $\alpha$ is square integrable with respect to $d\langle M, M\rangle \otimes P$,
the kernel function at $t$ will be a signed measure in its set variable. Moreover,
this measure is absolutely continuous with respect to the conditional probability, if
restricted to $\mathcal{G}_t$, and $\alpha$ coincides with their Radon-Nikodym density.




We shall even be able to show that this relationship also makes sense in the 
reverse direction. Roughly, if absolute continuity of the stochastic integral kernel 
with respect to the conditional probabilities holds, and the Radon-Nikodym density
is square integrable, the latter will turn out to provide an information drift $\alpha$ in
a Doob-Meyer decomposition of $S$ in the larger filtration.

We shall finish the section with an illustration of this fundamental relationship 
by discussing some simple examples of particularly enlarged filtrations. 

The discussion of the details of this fundamental relationship requires some 
care with the complexity of the underlying filtrations and state spaces. Of course, 
the need to work with conditional probabilities first of all confines us to spaces 
on which they exist. Let therefore $(\Omega, \mathcal{F}, P)$ be a standard Borel probability
space (see [22]) with a filtration $(\mathcal{F}_t^0)_{t \ge 0}$ consisting of countably generated
$\sigma$-algebras, and $M$ a $(\mathcal{F}_t^0)$-local martingale. We will also deal with the smallest
right-continuous and completed filtration containing $(\mathcal{F}_t^0)$, which we denote
by $(\mathcal{F}_t)$. We suppose that $\mathcal{F}_0$ is trivial and that every $(\mathcal{F}_t)$-local martingale has a
continuous modification. Since $\mathcal{F}_t^0$ is a subfield of a standard Borel space, there
exist regular conditional probabilities $P_t$ relative to the $\sigma$-algebras $\mathcal{F}_t^0$. Then for
any set $A \in \mathcal{F}$, the process

\[ (t, \omega) \mapsto P_t(\omega, A) \]

is an $(\mathcal{F}_t^0)$-martingale with a continuous modification (see, e.g., Theorem 4,
Chapter VI in [11]). Note that the modification may not be adapted to $(\mathcal{F}_t^0)$,
but only to $(\mathcal{F}_t)$. Furthermore, it is no problem to assume that the processes
$P_t(\cdot, A)$ are modified in a way such that $P_t(\omega, \cdot)$ remains a measure on $\mathcal{F}$ for
$P_M$-almost all $(\omega, t)$, where $P_M$ is a measure on $\Omega \times \mathbb{R}_+$ defined by

\[ P_M(\Gamma) = E \int_0^\infty 1_\Gamma(\omega, t) \, d\langle M, M\rangle_t, \qquad \Gamma \in \mathcal{F} \otimes \mathcal{B}_+. \]

It is known that each of these martingales may be uniquely written (see, 
e.g., [24], Chapter V) 


\[ P_t(\cdot, A) = P(A) + \int_0^t k_s(\cdot, A) \, dM_s + L_t^A, \tag{7} \]

where $k(\cdot, A)$ is $(\mathcal{F}_t)$-predictable and $L^A$ satisfies $\langle L^A, M\rangle = 0$.
Now let $(\mathcal{G}_t^0)$ be another filtration on $(\Omega, \mathcal{F}, P)$ satisfying

\[ \mathcal{F}_t^0 \subset \mathcal{G}_t^0 \]

for all $0 \le t \le T$. We assume that each $\sigma$-field $\mathcal{G}_t^0$ is generated by a countable
number of sets, and denote by $(\mathcal{G}_t)$ the smallest right-continuous and completed
filtration containing $(\mathcal{G}_t^0)$. It is clear that each $\sigma$-field in the left-continuous
filtration $(\mathcal{G}_{t-}^0)$ is also generated by a countable number of sets. We claim that the
existence of an information drift of $(\mathcal{G}_t)$ relative to $(\mathcal{F}_t)$ for the process $M$ depends
on whether the following condition is satisfied or not.




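CONDITION 2.1. $k_t(\omega, \cdot)|_{\mathcal{G}_{t-}^0}$ is a signed measure and satisfies

\[ k_t(\omega, \cdot)|_{\mathcal{G}_{t-}^0} \ll P_t(\omega, \cdot)|_{\mathcal{G}_{t-}^0} \]

for $P_M$-a.a. $(\omega, t)$.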


REMARK 2.2. Unfortunately, we have to distinguish between the filtrations 
$(\mathcal{F}_t^0)$, $(\mathcal{G}_t^0)$ and their extensions $(\mathcal{F}_t)$, $(\mathcal{G}_t)$. The reason is that the regular
conditional probabilities considered exist only with respect to the smaller $\sigma$-fields. On
the other hand, we use stochastic integration techniques which were developed
only under the assumption that the underlying filtrations satisfy the usual
conditions, and this necessitates working also with the larger $\sigma$-fields.


Let us next state some essential properties of the Radon-Nikodym density 
process existing according to our condition. 


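LEMMA 2.3. Suppose Condition 2.1 is satisfied. Then there exists an
$(\mathcal{F}_t \otimes \mathcal{G}_t)$-predictable process $\gamma$ such that, for $P_M$-a.a. $(\omega, t)$,

\[ \gamma_t(\omega, \omega') = \frac{dk_t(\omega, \cdot)}{dP_t(\omega, \cdot)}\bigg|_{\mathcal{G}_{t-}^0} (\omega'). \]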


REMARK 2.4. Note that $\gamma_t(\omega, \cdot)$ is $\mathcal{G}_{t-}$-measurable. This is due to the fact that
the predictable $\sigma$-algebra does not change by taking the left-continuous version of
the underlying filtration.


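PROOF OF LEMMA 2.3. Let $t_i^n = i 2^{-n}$ for all $n \ge 0$ and $i \ge 0$. We denote by $\mathbb{T}$
the set of all $t_i^n$. It is possible to choose a family of finite partitions $(\mathcal{P}^{i,n})$ such
that:

• for all $t \in \mathbb{T}$, we have $\mathcal{G}_{t-}^0 = \sigma(\mathcal{P}^{i,n} : i, n \ge 0 \text{ s.t. } t_i^n = t)$,

• $\mathcal{P}^{i,n} \subset \mathcal{P}^{i+1,n}$,

• if $i < j$, $n < m$ and $i 2^{-n} = j 2^{-m}$, then $\mathcal{P}^{i,n} \subset \mathcal{P}^{j,m}$.

We define, for all $n \ge 0$,

\[ \gamma_t^n(\omega, \omega') = \sum_{i \ge 0} \sum_{A \in \mathcal{P}^{i,n}} 1_{]t_i^n, t_{i+1}^n]}(t) \, 1_A(\omega') \, \frac{k_t(\omega, A)}{P_t(\omega, A)}. \]

Note that $\frac{k_t(\omega, A)}{P_t(\omega, A)}$ is $(\mathcal{F}_t)$-predictable and $1_{]t_i^n, t_{i+1}^n]}(t) 1_A(\omega')$ is $(\mathcal{G}_t)$-predictable.
Hence, the product of both functions, defined as a function on $\Omega^2 \times \mathbb{R}_+$, is
predictable with respect to $(\mathcal{F}_t \otimes \mathcal{G}_t)$. It follows that each $\gamma^n$ and, thus,

\[ \gamma = \liminf_{n} \gamma^n \]

is $(\mathcal{F}_t \otimes \mathcal{G}_t)$-predictable.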




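Now fix $t \ge 0$. We claim that $k_t(\omega, \cdot) = \int_{\cdot} \gamma_t(\omega, \omega') P_t(\omega, d\omega')$ and, hence, that
$\gamma_t(\omega, \cdot)$ is the density of $k_t(\omega, \cdot)$ with respect to $P_t(\omega, \cdot)$, $P_M$-a.s. For all $n \ge 0$,
let $j = j(n)$ be the integer satisfying $t_j^n < t \le t_{j+1}^n$, and denote by $\mathcal{Q}^n$ the
corresponding partition $\mathcal{P}^{j,n}$. Observe that $(\mathcal{Q}^n)$ is an increasing sequence of partitions
satisfying

\[ \sigma(\mathcal{Q}^n : n \ge 0) = \mathcal{G}_{t-}^0 \]

and, hence,

\[
\begin{aligned}
\gamma_t(\omega, \omega') &= \liminf_{n} \gamma_t^n(\omega, \omega') \\
&= \liminf_{n} \sum_{A \in \mathcal{Q}^n} 1_A(\omega') \frac{k_t(\omega, A)}{P_t(\omega, A)}
 = \frac{dk_t(\omega, \cdot)}{dP_t(\omega, \cdot)}\bigg|_{\mathcal{G}_{t-}^0} (\omega'). \qquad \square
\end{aligned}
\]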




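LEMMA 2.5. If $(t, \omega, \omega') \mapsto \theta_t(\omega, \omega')$ is $(\mathcal{F}_t \otimes \mathcal{G}_t)$-predictable and bounded,
then

\[ \int\!\!\int\!\!\int \theta_t(\omega, \omega') \, P_t(\omega, d\omega') \, d\langle M, M\rangle_t \, dP(\omega) = \int\!\!\int \theta_t(\omega, \omega) \, d\langle M, M\rangle_t \, dP(\omega). \]

PROOF. Let $0 < r \le s$, $A \in \mathcal{F}_r$, $B \in \mathcal{G}_r$ and

\[ \theta_t(\omega, \omega') = 1_{]r,s]}(t) \, 1_A(\omega) \, 1_B(\omega'). \]

Then

\[
\begin{aligned}
\int\!\!\int\!\!\int \theta_t(\omega, \omega') \, P_t(\omega, d\omega') \, d\langle M, M\rangle_t \, dP(\omega)
&= \int\!\!\int_r^s 1_A(\omega) P_t(\omega, B) \, d\langle M, M\rangle_t \, dP(\omega) \\
&= \int\!\!\int_r^s 1_A(\omega) 1_B(\omega) \, d\langle M, M\rangle_t \, dP(\omega) \\
&= \int\!\!\int \theta_t(\omega, \omega) \, d\langle M, M\rangle_t \, dP(\omega),
\end{aligned}
\]

where the second equality holds due to results about optional projections (see
Theorem 57, Chapter VI in [11]). By a monotone class argument, this can be extended
to all bounded and $(\mathcal{F}_t \otimes \mathcal{G}_t)$-predictable processes. □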


THEOREM 2.6. Suppose Condition 2.1 is satisfied and $\gamma$ is as in Lemma 2.3.
Then

\[ \alpha_t(\omega) = \gamma_t(\omega, \omega) \]

is the information drift of $(\mathcal{G}_t)$ relative to $(\mathcal{F}_t)$.




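PROOF. Suppose $\tau$ to be a stopping time such that $M^\tau$ is a martingale. For
$0 \le s < t$ and $A \in \mathcal{G}_s^0$, we have to show

\[ E[1_A (M_t^\tau - M_s^\tau)] = E\Big[1_A \int_s^t \gamma_u(\omega, \omega) \, d\langle M, M\rangle_u^\tau\Big]. \]

For notational simplicity, write $M^\tau = M$ and observe

\[
\begin{aligned}
E[1_A (M_t - M_s)] &= E[P_t(\cdot, A)(M_t - M_s)] \\
&= E\Big[(M_t - M_s) \int_0^t k_u(\cdot, A) \, dM_u\Big] + E[(M_t - M_s) L_t^A] \\
&= E\Big[\int_s^t k_u(\cdot, A) \, d\langle M, M\rangle_u\Big] \\
&= E\Big[\int_s^t \int 1_A(\omega') \gamma_u(\omega, \omega') \, P_u(\omega, d\omega') \, d\langle M, M\rangle_u\Big] \\
&= E\Big[1_A(\omega) \int_s^t \gamma_u(\omega, \omega) \, d\langle M, M\rangle_u\Big],
\end{aligned}
\]

where we used Lemma 2.5 in the last equation. □

COROLLARY 2.7. $(\mathcal{G}_t)$ is a finite utility filtration if and only if

\[ \int\!\!\int\!\!\int \gamma_t^2(\omega, \omega') \, P_t(\omega, d\omega') \, d\langle M, M\rangle_t \, dP(\omega) < \infty. \]

PROOF. This follows immediately from Lemma 2.5. □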


We now look at the problem from the reverse direction. Starting with the 
assumption that $(\mathcal{G}_t)$ is a finite utility filtration, which amounts to
$E \int_0^\infty \alpha^2 \, d\langle M, M\rangle < \infty$, we show the validity of Condition 2.1.

In the sequel, $(\mathcal{G}_t)$ denotes a finite utility filtration and $\alpha$ its predictable
information drift, that is,

\[ \tilde{M} = M - \int_0^\cdot \alpha_t \, d\langle M, M\rangle_t \tag{8} \]

is a $(\mathcal{G}_t)$-local martingale. To prove the main results (Theorems 2.10 and 2.12), we
need the following lemma. 


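LEMMA 2.8. Let $0 \le s < t$ and $\mathcal{P} = \{A_1, \ldots, A_n\}$ be a finite partition of $\Omega$
into $\mathcal{G}_s^0$-measurable sets. Then

\[ E \int_s^t \sum_{k=1}^n \Big(\frac{k_u}{P_u}\Big)^2 (\cdot, A_k) \, 1_{A_k} \, d\langle M, M\rangle_u \le 4 E\Big(\int_s^t \alpha_u^2 \, d\langle M, M\rangle_u\Big) < \infty. \]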


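PROOF. Let $\mathcal{P} = \{A_1, \ldots, A_n\}$ be a finite $\mathcal{G}_s^0$-partition. An application of Itô's
formula, in conjunction with (7) and (8), yields

\[
\begin{aligned}
&\sum_{k=1}^n \big[ 1_{A_k} \log P_s(\cdot, A_k) - 1_{A_k} \log P_t(\cdot, A_k) \big] \\
&\quad = \sum_{k=1}^n \Big[ -\int_s^t \frac{1_{A_k}}{P_u(\cdot, A_k)} \, dP_u(\cdot, A_k)
 + \frac{1}{2} \int_s^t \frac{1_{A_k}}{P_u(\cdot, A_k)^2} \, d\langle P(\cdot, A_k), P(\cdot, A_k)\rangle_u \Big] \\
&\quad = \sum_{k=1}^n \Big[ -\int_s^t \frac{k_u}{P_u}(\cdot, A_k) 1_{A_k} \, d\tilde{M}_u
 - \int_s^t \frac{k_u}{P_u}(\cdot, A_k) 1_{A_k} \alpha_u \, d\langle M, M\rangle_u \\
&\qquad\qquad - \int_s^t \frac{1_{A_k}}{P_u(\cdot, A_k)} \, dL_u^{A_k}
 + \frac{1}{2} \int_s^t \Big(\frac{k_u}{P_u}\Big)^2 (\cdot, A_k) 1_{A_k} \, d\langle M, M\rangle_u \\
&\qquad\qquad + \frac{1}{2} \int_s^t \frac{1_{A_k}}{P_u(\cdot, A_k)^2} \, d\langle L^{A_k}, L^{A_k}\rangle_u \Big].
\end{aligned}
\tag{9}
\]

Note that $P_t(\cdot, A_k) \log P_t(\cdot, A_k)$ is a submartingale bounded from below for
all $k$. Hence, the expectation of the left-hand side in the previous equation is at
most 0.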

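A priori it is not clear whether

\[ \sum_{k=1}^n \int_s^t \frac{k_u}{P_u}(\cdot, A_k) 1_{A_k} \, d\tilde{M}_u \]

is integrable or not. Consider therefore, for all $\varepsilon > 0$, stopping times defined by

\[ \tau_k^\varepsilon = \begin{cases} \infty, & \omega \in A_k^c, \\ \inf\{t \ge s : P_t(\cdot, A_k) \le \varepsilon\}, & \text{else}, \end{cases} \]

and

\[ \tau^\varepsilon = \tau_1^\varepsilon \wedge \cdots \wedge \tau_n^\varepsilon. \]

Observe that $\tau^\varepsilon \to \infty$ as $\varepsilon \downarrow 0$ and that the stopped process

\[ \sum_{k=1}^n \int_s^{\cdot \wedge \tau^\varepsilon} \frac{k_u}{P_u}(\cdot, A_k) 1_{A_k} \, d\tilde{M}_u \]

has expectation zero, since

\[
\begin{aligned}
E \int_s^{t \wedge \tau^\varepsilon} \sum_{k=1}^n \Big(\frac{k_u}{P_u}\Big)^2 (\cdot, A_k) 1_{A_k} \, d\langle M, M\rangle_u
&\le \frac{1}{\varepsilon^2} E \sum_{k=1}^n \int_s^{t \wedge \tau^\varepsilon} k_u^2(\cdot, A_k) 1_{A_k} \, d\langle M, M\rangle_u \\
&\le \frac{1}{\varepsilon^2} \sum_{k=1}^n E \int_s^t d\langle P(\cdot, A_k), P(\cdot, A_k)\rangle_u \\
&< \infty.
\end{aligned}
\]

Similarly, one can show that the expectation of

\[ \sum_{k=1}^n \int_s^{t \wedge \tau^\varepsilon} \frac{1_{A_k}}{P_u(\cdot, A_k)} \, dL_u^{A_k} \]

vanishes. Consequently, we may deduce from (9) and the Kunita-Watanabe
inequality

\[
\begin{aligned}
E \int_s^{t \wedge \tau^\varepsilon} \sum_{k=1}^n \Big(\frac{k_u}{P_u}\Big)^2 (\cdot, A_k) 1_{A_k} \, d\langle M, M\rangle_u
&\le 2 E \int_s^{t \wedge \tau^\varepsilon} \sum_{k=1}^n \frac{k_u}{P_u}(\cdot, A_k) 1_{A_k} \alpha_u \, d\langle M, M\rangle_u \\
&\le 2 \Big( E \int_s^{t \wedge \tau^\varepsilon} \sum_{k=1}^n \Big(\frac{k_u}{P_u}\Big)^2 (\cdot, A_k) 1_{A_k} \, d\langle M, M\rangle_u \Big)^{1/2}
 \Big( E \int_s^{t \wedge \tau^\varepsilon} \alpha_u^2 \, d\langle M, M\rangle_u \Big)^{1/2},
\end{aligned}
\]

which implies

\[ E \int_s^{t \wedge \tau^\varepsilon} \sum_{k=1}^n \Big(\frac{k_u}{P_u}\Big)^2 (\cdot, A_k) 1_{A_k} \, d\langle M, M\rangle_u \le 4 E\Big( \int_s^{t \wedge \tau^\varepsilon} \alpha_u^2 \, d\langle M, M\rangle_u \Big). \]

Now the proof may be completed by a monotone convergence argument. □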


Let $\mathbb{T}$ and $(\mathcal{P}^{i,n})_{i,n \ge 0}$ be a family of partitions as in the proof of Lemma 2.3.
We define, for all $n \ge 0$,

\[ Z_t^n(\omega, \omega') = \sum_{i \ge 0} \sum_{A \in \mathcal{P}^{i,n}} 1_{]t_i^n, t_{i+1}^n]}(t) \, 1_A(\omega') \, \frac{k_t(\omega, A)}{P_t(\omega, A)}. \]

Note that $Z^n$ is $(\mathcal{F}_t \otimes \mathcal{G}_t)$-predictable. We are now able to prove a converse
statement to Theorem 2.6. Observe first the following:


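LEMMA 2.9. For $P_M$-almost all $(\omega, t) \in \Omega \times \mathbb{R}_+$, the discrete process
$(Z_t^m(\omega, \cdot))_{m \ge 1}$ is an $L^2(P_t(\omega, \cdot))$-bounded martingale.

PROOF. Every statement in the sequel is meant to hold for $P_M$-a.a. $(\omega, t) \in
\Omega \times \mathbb{R}_+$.
Let $m \ge 0$, $i \ge 0$ and $j$ be the natural number such that $]t_i^{m+1}, t_{i+1}^{m+1}] \subset
]t_j^m, t_{j+1}^m]$. We start by proving that on $]t_i^{m+1}, t_{i+1}^{m+1}]$ we have

\[ E^{P_t(\omega, \cdot)}\big[ Z_t^{m+1}(\omega, \cdot) \,\big|\, \mathcal{P}^{j,m} \big] = Z_t^m(\omega, \cdot). \]

For this, let $B \in \mathcal{P}^{j,m}$ and $A_1, \ldots, A_k \in \mathcal{P}^{i,m+1}$ such that $A_1 \cup \cdots \cup A_k = B$. Note
that

\[
\begin{aligned}
E^{P_t(\omega, \cdot)}\big[ 1_B Z_t^{m+1}(\omega, \cdot) \big]
&= E^{P_t(\omega, \cdot)}\Big[ \sum_{l=1}^k 1_{A_l} \frac{k_t(\omega, A_l)}{P_t(\omega, A_l)} \Big] \\
&= \sum_{l=1}^k k_t(\omega, A_l) \\
&= k_t(\omega, B) \\
&= E^{P_t(\omega, \cdot)}\big[ 1_B Z_t^m(\omega, \cdot) \big]
\end{aligned}
\]

on $]t_i^{m+1}, t_{i+1}^{m+1}]$. Consequently, the process $(Z_t^m(\omega, \cdot))_{m \ge 1}$ is a martingale (with
respect to a filtration depending on $t$). The martingale property implies that the
sequence $\int (Z_t^n)^2(\omega, \omega') P_t(\omega, d\omega')$ is increasing and, hence, by monotone
convergence,

\[
\sup_n E \int\!\!\int (Z_t^n)^2(\omega, \omega') \, P_t(\omega, d\omega') \, d\langle M, M\rangle_t
= E \int \sup_n \int (Z_t^n)^2(\omega, \omega') \, P_t(\omega, d\omega') \, d\langle M, M\rangle_t.
\]

By Lemmas 2.8 and 2.5, we have

\[
\begin{aligned}
\sup_n E \int\!\!\int (Z_t^n)^2(\omega, \omega') \, P_t(\omega, d\omega') \, d\langle M, M\rangle_t
&= \sup_n E \int (Z_t^n)^2(\omega, \omega) \, d\langle M, M\rangle_t \\
&= \sup_n E \sum_{i \ge 0} \int_{t_i^n}^{t_{i+1}^n} \sum_{A \in \mathcal{P}^{i,n}} 1_A \Big(\frac{k_t}{P_t}\Big)^2 (\cdot, A) \, d\langle M, M\rangle_t \\
&\le 4 E\Big( \int_0^\infty \alpha^2 \, d\langle M, M\rangle_t \Big) < \infty.
\end{aligned}
\]

This shows that $(Z^n)_{n \ge 1}$ is an $L^2(P_t(\omega, \cdot))$-bounded martingale. □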
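The mechanism behind Lemma 2.9 can be seen in a finite toy model. The sketch below is only an illustration under assumptions of our own: eight outcomes, a uniform probability `P`, an arbitrary signed measure `k`, and dyadic nested partitions standing in for the $\mathcal{P}^{i,n}$ at a fixed $(\omega, t)$. It checks that the ratios $k(A)/P(A)$ along refining partitions form a martingale and that their second moments are nondecreasing.

```python
from fractions import Fraction

# Hypothetical finite model: 8 outcomes, a probability P and a signed measure k.
P = [Fraction(1, 8)] * 8
k = [Fraction(v, 8) for v in (3, -1, 2, 0, -2, 1, -4, 1)]

# Nested partitions into blocks of 8, 4, 2 and 1 consecutive outcomes.
partitions = [[list(range(i, i + size)) for i in range(0, 8, size)]
              for size in (8, 4, 2, 1)]

def ratio_process(partition):
    """Z(w) = k(A) / P(A) for the block A of the partition containing w."""
    z = [None] * 8
    for block in partition:
        value = sum(k[w] for w in block) / sum(P[w] for w in block)
        for w in block:
            z[w] = value
    return z

def conditional_expectation(z, partition):
    """E^P[z | partition], computed block by block."""
    out = [None] * 8
    for block in partition:
        mass = sum(P[w] for w in block)
        mean = sum(z[w] * P[w] for w in block) / mass
        for w in block:
            out[w] = mean
    return out

Z = [ratio_process(part) for part in partitions]
second_moments = [sum(z[w] ** 2 * P[w] for w in range(8)) for z in Z]

# Martingale property: conditioning the finer ratio on the coarser partition
# gives back the coarser ratio.
is_martingale = all(conditional_expectation(Z[n + 1], partitions[n]) == Z[n]
                    for n in range(len(Z) - 1))
print(is_martingale, second_moments)
```

Exact rational arithmetic is used so that the martingale identity can be tested with equality rather than a tolerance.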


We now will show that $k$ can be chosen to be a signed measure. For this we
identify $k_t(\omega, \cdot)$ with another measure on a countable generator of $\mathcal{G}_{t-}^0$. We then
apply the result that two Banach space valued measures are equal, if they coincide
on a generator stable for finite intersections.

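THEOREM 2.10. The kernel $k$ may be chosen such that

\[ \mathcal{G}_{t-}^0 \ni A \mapsto k_t(\omega, A) \in \mathbb{R} \]

is a signed measure which is absolutely continuous with respect to $P_t(\omega, \cdot)|_{\mathcal{G}_{t-}^0}$,
for $P_M$-a.a. $(\omega, t) \in \Omega \times [0, \infty)$. This means that Condition 2.1 is satisfied.

PROOF. Lemma 2.9 implies that $(Z_t^m(\omega, \cdot))_{m \ge 1}$ is an $L^2(P_t(\omega, \cdot))$-bounded
martingale and, hence, for a.a. fixed $(\omega, t)$, $(Z_t^m(\omega, \cdot))_{m \ge 1}$ possesses a limit $Z$.
It can be chosen to be $(\mathcal{F}_t \otimes \mathcal{G}_t)$-predictable. Take, for example,

\[ Z_t = \liminf_{n} (Z_t^n \vee 0) + \limsup_{n} (Z_t^n \wedge 0). \]

Now define a signed measure by

\[ \tilde{k}_t(\omega, A) = \int 1_A(\omega') Z_t(\omega, \omega') \, P_t(\omega, d\omega'). \]

Observe that $\tilde{k}_t(\omega, \cdot)$ is absolutely continuous with respect to $P_t(\omega, \cdot)$ and that we
have, for all $A \in \mathcal{P}^{j,m}$ with $j 2^{-m} < t$,

\[ \tilde{k}_t(\omega, A) = k_t(\omega, A) \]

for $P_M$-a.a. $(\omega, t) \in \Omega \times \mathbb{R}_+$. One may also interpret $\mathcal{G}_{t-}^0 \ni A \mapsto \tilde{k}(\omega, A)$ as an
$L^2(M)$-valued measure. By applying the stochastic integral operator, we obtain an
$L^2(\Omega)$-valued measure: $\mathcal{G}_{t-}^0 \ni A \mapsto \int_0^t \tilde{k}_s(\omega, A) \, dM_s$. Moreover,

\[ P_t(\omega, A) = P(A) + \int_0^t \tilde{k}_s(\omega, A) \, dM_s + L_t^A \tag{10} \]

for all $A \in \bigcup_{j 2^{-m} < t} \mathcal{P}^{j,m}$. Since the LHS and both expressions on the RHS are
measures coinciding on a system which is stable for intersections, (10) holds for
all $A \in \mathcal{G}_{t-}^0$. Hence, by choosing $k_t(\cdot, A) = \tilde{k}_t(\cdot, A)$ for all $A \in \mathcal{G}_{t-}^0$, the proof is
complete. □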


REMARK 2.11. Since $k$ is determined up to $P_M$-null sets, we may assume
that $k_t(\omega, \cdot)$ is absolutely continuous relative to $P_t(\omega, \cdot)$ everywhere.




We close this section with some examples showing how (well-known) infor- 
mation drifts can be derived explicitly, based on the formalism of Theorem 2.6. 
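To this end, it is not always necessary to determine the signed measures $k_t(\omega, \cdot)$
on the whole $\sigma$-algebras $\mathcal{G}_{t-}^0$, but only on some sub-$\sigma$-fields. This is the case, for
example, if

\[ \mathcal{G}_t^0 = \mathcal{F}_t^0 \vee \mathcal{H}_t^0, \qquad 0 \le t \le T, \]

where $(\mathcal{H}_t^0)$ is some countably generated filtration on $(\Omega, \mathcal{F})$.
Now suppose that $k_t(\omega, \cdot)$ is a signed measure on $(\mathcal{H}_{t-}^0)$ satisfying

\[ k_t(\omega, \cdot)|_{\mathcal{H}_{t-}^0} \ll P_t(\omega, \cdot)|_{\mathcal{H}_{t-}^0} \]

for $P_M$-a.a. $(\omega, t)$. Then we can show with the arguments of the proof of Lemma 2.3
that there is an $(\mathcal{F}_t \otimes \mathcal{H}_t)$-predictable process $\beta$ such that $P_M$-a.e.

\[ \beta_t(\omega, \omega') = \frac{dk_t(\omega, \cdot)}{dP_t(\omega, \cdot)}\bigg|_{\mathcal{H}_{t-}^0} (\omega'). \]

The information drift of $(\mathcal{G}_t)$ relative to $(\mathcal{F}_t)$ is already determined by the trace
of $(\beta_t)$. For the corresponding analogue of Theorem 2.6, we shall give a more
explicit statement.

THEOREM 2.12. The process

\[ \alpha_t(\omega) = \beta_t(\omega, \omega) \]

is the information drift of $(\mathcal{G}_t)$ relative to $(\mathcal{F}_t)$.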


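PROOF. Suppose $\tau$ to be a stopping time such that $M^\tau$ is a martingale. For
$0 \le s < t$, $A \in \mathcal{H}_s^0$ and $B \in \mathcal{F}_s^0$, we have to show

\[ E[1_A 1_B (M_t^\tau - M_s^\tau)] = E\Big[ 1_A 1_B \int_s^t \beta_u(\omega, \omega) \, d\langle M, M\rangle_u^\tau \Big]. \]

For simplicity, assume $M^\tau = M$, and observe, like in the proof of Theorem 2.6,

\[
\begin{aligned}
E[1_A 1_B (M_t - M_s)] &= E[1_B P_t(\cdot, A)(M_t - M_s)] \\
&= E\Big[ 1_A(\omega) 1_B(\omega) \int_s^t \beta_u(\omega, \omega) \, d\langle M, M\rangle_u \Big]. \qquad \square
\end{aligned}
\]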


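EXAMPLE 2.13. Let $(W_t)$ be the standard Wiener process and $(\mathcal{F}_t^0)$ the
filtration generated by $(W_t)$. Moreover, let $(Y_t)$ be a Gaussian process independent
of $\mathcal{F}_1$ such that, for each pair $s, t$ with $0 \le s < t$, the difference $Y_s - Y_t$ is
independent of $Y_t$. We denote by $w_t$ the variance of $Y_t$.

We enlarge our filtration by

\[ \mathcal{H}_t^0 = \sigma(W_1 + Y_s : 0 \le s \le t) = \sigma(W_1 + Y_t) \vee \sigma(Y_s - Y_t : 0 \le s \le t), \]

and put $\mathcal{G}_t^0 = \mathcal{F}_t^0 \vee \mathcal{H}_t^0$, $0 \le t \le 1$. Now observe that, for all $C \in \sigma(Y_s - Y_t : 0 \le
s \le t)$ and Borel sets $B \in \mathcal{B}(\mathbb{R})$, we have

\[
\begin{aligned}
P_t(\cdot, \{W_1 + Y_t \in B\} \cap C)
&= P(C) \int 1_B(x + W_1 - W_t + Y_t) \, dP \,\Big|_{x = W_t} \\
&= P(C) \int 1_B(y + x) \, \phi_{1-t+w_t}(y) \, dy \,\Big|_{x = W_t} \\
&= P(C) \int_B \phi_{1-t+w_t}(y - W_t) \, dy, \qquad 0 \le t < 1,
\end{aligned}
\]

where

\[ \phi_v(y) = \frac{1}{(2\pi v)^{1/2}} e^{-y^2/(2v)}. \]

Now observe that $f(x, t) = P(C) \int_B \phi_{1-t+w_t}(y - x) \, dy$ is differentiable in $x$ and
satisfies

\[ \frac{\partial}{\partial x} f(x, t) = P(C) \int_B \frac{y - x}{1 - t + w_t} \, \phi_{1-t+w_t}(y - x) \, dy \]

for all $0 \le t < 1$ and $x \in \mathbb{R}$. By Itô's formula,

\[ P_t(\cdot, \{W_1 + Y_t \in B\} \cap C) = f(0, 0) + \int_0^t \frac{\partial}{\partial x} f(W_s, s) \, dW_s + A_t, \qquad 0 \le t < 1, \]

where $A$ is a process of bounded variation. Note that $A$ is also a martingale and,
thus, $A = 0$. Hence,

\[
\begin{aligned}
k_t(\cdot, \{W_1 + Y_t \in B\} \cap C)
&= P(C) \int_B \frac{y - W_t}{1 - t + w_t} \, \phi_{1-t+w_t}(y - W_t) \, dy \\
&= P(C) \int 1_B(y + x) \frac{y}{1 - t + w_t} \, \phi_{1-t+w_t}(y) \, dy \,\Big|_{x = W_t(\omega)} \\
&= \int_{\{W_1 + Y_t \in B\} \cap C} \frac{W_1(\omega') + Y_t(\omega') - W_t(\omega)}{1 - t + w_t} \, P_t(\omega, d\omega').
\end{aligned}
\]

As a consequence,

\[ \beta_t(\omega, \omega') = \frac{k_t(\omega, d\omega')}{P_t(\omega, d\omega')}\bigg|_{\mathcal{H}_{t-}^0} = \frac{W_1(\omega') + Y_t(\omega') - W_t(\omega)}{1 - t + w_t} \]

and by Theorem 2.12,

\[ W_t - \int_0^t \frac{W_1 + Y_s - W_s}{1 - s + w_s} \, ds, \qquad 0 \le t < 1, \]

is a martingale relative to $(\mathcal{G}_t)$.
Similar examples can be found in [8] where the information drifts are derived
in a completely different way though.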
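To connect Example 2.13 with Proposition 1.4, one can look at the special case of a time-independent noise, $Y_t \equiv Y$ with variance $w > 0$ (an assumption made here for illustration; the difference $Y_s - Y_t = 0$ is then trivially independent of $Y_t$). The drift $(W_1 + Y - W_t)/(1 - t + w)$ has second moment $1/(1 - t + w)$, so the utility gain over $[0, 1]$ should be $\tfrac{1}{2} \int_0^1 dt/(1 - t + w)$. The sketch below compares a numerical value of this integral with the Gaussian mutual information $\tfrac{1}{2} \log(1 + 1/w)$ between $W_1$ and $W_1 + Y$; the function names are hypothetical.

```python
import math

def utility_gain(w, n_steps=100_000):
    """Midpoint-rule approximation of 0.5 * int_0^1 dt / (1 - t + w).

    Assumes Y_t = Y for all t with Var(Y) = w > 0, so that
    E[drift_t**2] = 1 / (1 - t + w).
    """
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        t = (i + 0.5) * h
        total += h / (1.0 - t + w)
    return 0.5 * total

def gaussian_mutual_information(w):
    """I(W_1; W_1 + Y) for independent N(0, 1) and N(0, w) variables."""
    return 0.5 * math.log(1.0 + 1.0 / w)

w = 0.25
gain = utility_gain(w)
info = gaussian_mutual_information(w)
print(gain, info)
```

In this special case the two quantities agree, and both blow up as $w \downarrow 0$, that is, as the insider's signal approaches exact knowledge of $W_1$.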




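EXAMPLE 2.14. Let $(W_t)$ be the standard Wiener process and $(\mathcal{F}_t)$ the
Wiener filtration. We use the abbreviation $W_t^* = \sup_{0 \le s \le t} W_s$ and consider the
filtration enlarged by the random variable $G = 1_{[0,c]}(W_1^*)$, $c > 0$. Again, we want
to apply Theorem 2.12 in order to obtain the information drift of $\mathcal{G}_t = \mathcal{F}_t \vee \sigma(G)$.
To this end, let $Z_t = \sup_{t \le r \le 1} (W_r - W_t)$ and denote by $p_t$ the density of $Z_t$,
$0 \le t < 1$. Now,

\[
\begin{aligned}
P_t(\cdot, G = 1) &= P(W_t^* \vee (W_t + Z_t) \le c \mid \mathcal{F}_t) \\
&= \int 1_{[0,c]}(y \vee (x + Z_t)) \, dP \,\Big|_{x = W_t,\, y = W_t^*} \\
&= 1_{[0,c]}(y) \int_0^{c - x} p_t(z) \, dz \,\Big|_{x = W_t,\, y = W_t^*}
\end{aligned}
\]

for all $0 \le t < 1$. Note that $F(x, y, t) = 1_{[0,c]}(y) \int_0^{c-x} p_t(z) \, dz$ is differentiable in
$x$ for all $0 \le t < 1$ and $x \in \mathbb{R}$, and by Itô's formula,

\[ P_t(\cdot, G = 1) = F(0, 0, 0) + \int_0^t \frac{\partial}{\partial x} F(W_s, W_s^*, s) \, dW_s + A_t, \qquad 0 \le t < 1, \]

where $A$ is a process of bounded variation. Hence,

\[ k_t(\cdot, G = 1) = \frac{\partial}{\partial x} F(W_t, W_t^*, t), \qquad 0 \le t < 1. \]

Similarly, we have

\[ P_t(\cdot, G = 0) = H(W_t, W_t^*, t), \qquad 0 \le t < 1, \]

and

\[ k_t(\cdot, G = 0) = \frac{\partial}{\partial x} H(W_t, W_t^*, t), \qquad 0 \le t < 1, \]

where

\[ H(x, y, t) = 1_{(c, \infty)}(y) + 1_{[0,c]}(y) \int_{c-x}^\infty p_t(z) \, dz. \]

As a consequence,

\[
\begin{aligned}
\beta_t(\omega, \omega') &= \frac{k_t(\omega, d\omega')}{P_t(\omega, d\omega')}\bigg|_{\sigma(G)} \\
&= 1_{\{1\}}(G(\omega')) \frac{\partial}{\partial x} \log F(W_t(\omega), W_t^*(\omega), t)
 + 1_{\{0\}}(G(\omega')) \frac{\partial}{\partial x} \log H(W_t(\omega), W_t^*(\omega), t), \qquad 0 \le t < 1.
\end{aligned}
\]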
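The drift in Example 2.14 becomes explicit once a formula for $p_t$ is inserted. The text does not state one; the sketch below assumes the standard reflection-principle fact that $Z_t$ has the law of $|N(0, 1 - t)|$, so that $p_t(z) = 2\phi_{1-t}(z)$ for $z \ge 0$ and $\int_0^{c-x} p_t(z)\,dz = \operatorname{erf}\big((c - x)/\sqrt{2(1 - t)}\big)$. It evaluates the two logarithmic derivatives on the event $\{W_t^* \le c\}$ and compares them with central finite differences. All function names are hypothetical.

```python
import math

def p(z, t):
    """Assumed density of Z_t = sup_{t<=r<=1}(W_r - W_t) for z >= 0."""
    v = 1.0 - t
    return 2.0 * math.exp(-z * z / (2.0 * v)) / math.sqrt(2.0 * math.pi * v)

def F(x, t, c):
    """F(x, y, t) restricted to {y <= c}: P(Z_t <= c - x), for x < c."""
    return math.erf((c - x) / math.sqrt(2.0 * (1.0 - t)))

def drift_given_G1(x, t, c):
    """d/dx log F on {y <= c}: drift seen by the insider when G = 1."""
    return -p(c - x, t) / F(x, t, c)

def drift_given_G0(x, t, c):
    """d/dx log H on {y <= c}, where H = 1 - F there: drift when G = 0."""
    return p(c - x, t) / (1.0 - F(x, t, c))

# Finite-difference check at one (hypothetical) state.
x, t, c, h = 0.2, 0.5, 1.0, 1e-5
fd_G1 = (math.log(F(x + h, t, c)) - math.log(F(x - h, t, c))) / (2 * h)
fd_G0 = (math.log(1 - F(x + h, t, c)) - math.log(1 - F(x - h, t, c))) / (2 * h)
print(drift_given_G1(x, t, c), fd_G1)
print(drift_given_G0(x, t, c), fd_G0)
```

The signs are as one would expect: knowing that the running maximum stays below $c$ pushes the Wiener process downward, while knowing that it will exceed $c$ pushes it upward.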




3. Monotone convergence of information drifts. In the preceding section 
we established a general relationship between the information drift and the regu- 
lar conditional probabilities of filtrations. In this framework the knowledge of the 
better-informed agent is described by a general enlarged filtration $(\mathcal{G}_t)$ of $(\mathcal{F}_t)$. We
shall now consider the question whether this situation may be well approximated
by "stepwise initial" enlargements, for which we take $\mathcal{F}_t \vee \mathcal{G}_{t_i-}$ for $t \in [t_i, t_{i+1})$,
if the family $(t_i)_{i \ge 0}$ is a partition of $\mathbb{R}_+$. One particularly important question in
this context concerns the behavior of the information drifts along such a sequence
of discretized enlargements. Of course, we expect some convergence of the drifts.
We shall establish this fact rigorously in the following section. In the present
section, we shall prepare the treatment of this problem by solving a somewhat more
general problem. Let $(\mathcal{G}_t^n)_{n \in \mathbb{N}}$ be an increasing sequence of finite utility filtrations
and $\sup_n u_{\mathcal{G}^n}(x)$ be finite. We will show that the smallest filtration containing every
$(\mathcal{G}_t^n)$ is then also a finite utility filtration.

Since we will not deal with regular conditional probabilities in this section, it is 
not necessary to require our probability space $(\Omega, \mathcal{F})$ to be standard.

We use the terminology of Revuz and Yor [24]: $H^2(\mathcal{F}_t)$ denotes the set
of $L^2$-bounded continuous $(\mathcal{F}_t)$-martingales, that is, the space of continuous
$(\mathcal{F}_t, P)$-martingales $M$ such that

\[ \sup_{t \ge 0} E(M_t^2) < \infty. \]

We need the following characterization of $H^2(\mathcal{F}_t)$.

LEMMA 3.1 (Proposition 1.23 in [24]). A continuous $(\mathcal{F}_t)$-local martingale $M$
belongs to $H^2(\mathcal{F}_t)$ if and only if the following two conditions hold:

(i) $E(M_0^2) < \infty$,
(ii) $E(\langle M, M\rangle_\infty) < \infty$.

The properties (i) and (ii) are independent of the filtration considered. This is
due to the fact that the quadratic variation of $M$ does not change under a new
filtration $(\mathcal{G}_t)$ for which $M$ is still a semimartingale. We therefore have the following:

LEMMA 3.2. Suppose $M \in H^2(\mathcal{F}_t)$. Let $(\mathcal{G}_t)$ be a filtration such that $M$ is
still a $(\mathcal{G}_t)$-semimartingale. If

\[ M = \tilde{M} + A \]

is a Doob-Meyer decomposition with respect to $(\mathcal{G}_t)$ with $A_0 = 0$, then $\tilde{M}$ belongs
to $H^2(\mathcal{G}_t)$.

PROOF. Notice that $\tilde{M}_0 = M_0$ and $\langle \tilde{M}, \tilde{M}\rangle = \langle M, M\rangle$. The claim follows now
by applying Lemma 3.1 twice. □




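Now let $M$ be a continuous $(\mathcal{F}_t)$-local martingale and $(\mathcal{G}_t^n)_{n \ge 1}$ an increasing
sequence of filtrations, that is, for all $t \ge 0$, we have

\[ \mathcal{F}_t \subset \mathcal{G}_t^1 \subset \mathcal{G}_t^2 \subset \cdots \subset \mathcal{G}_t^n \subset \mathcal{G}_t^{n+1} \subset \cdots \]

We assume that, for all $n \ge 1$, the process $M$ is a $(\mathcal{G}_t^n)$-semimartingale with
Doob-Meyer decomposition of the form

\[ M = M^n + \int_0^\cdot \mu_s^n \, d\langle M, M\rangle_s, \]

where $\mu^n$ is $(\mathcal{G}_t^n)$-predictable. We then have the following asymptotic property.

LEMMA 3.3. If the processes $(\mu^n)_{n \in \mathbb{N}}$ converge to some $\mu$ in $L^2(M)$, then

\[ M - \int_0^\cdot \mu_s \, d\langle M, M\rangle_s \]

is a local martingale with respect to $\mathcal{G}_t = \bigvee_{n \ge 1} \mathcal{G}_t^n$, $t \ge 0$.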

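PROOF. Suppose the stopping time $\tau$ reduces $M$ such that $M^\tau$ is a bounded
martingale. Note that Lemma 3.2 implies that the stopped processes $(M^n)^\tau$ are
$(\mathcal{G}_t^n)$-martingales.

For simplicity, we assume $M^\tau = M$. Choose a constant $C > 0$ such that

\[ |M| \le C \quad \text{and} \quad E \int_0^\infty (\mu^n)^2 \, d\langle M, M\rangle_s \le C^2 \quad \text{for all } n \ge 1. \]

Now let $\varepsilon > 0$, $0 \le s < t$ and $A \in \mathcal{G}_s$. It suffices to show

\[ \Big| E[1_A (M_t - M_s)] - E\Big[ 1_A \int_s^t \mu_u \, d\langle M, M\rangle_u \Big] \Big| \le \varepsilon. \]

We start by choosing $n_0$ such that

\[ \|\mu^n - \mu\|_{L^2(M)} \le \frac{\varepsilon}{4 (E\langle M, M\rangle_\infty)^{1/2}} \]

for all $n \ge n_0$.

Note that $\bigcup_{n \ge n_0} \mathcal{G}_s^n$ is an algebra generating the $\sigma$-algebra $\bigvee_{n \ge 1} \mathcal{G}_s^n = \mathcal{G}_s =
\bigvee_{n \ge n_0} \mathcal{G}_s^n$. Hence, we can find a sequence $(A_i)_{i \in \mathbb{N}}$ of sets in $\bigcup_{n \ge n_0} \mathcal{G}_s^n$ such that
$P(A \,\Delta\, A_i) \to 0$. A subsequence of $(1_{A_i})_{i \in \mathbb{N}}$ converges to $1_A$ almost surely and,
therefore, we may choose $n \ge n_0$ and $\tilde{A} \in \mathcal{G}_s^n$ satisfying

\[ P(A \,\Delta\, \tilde{A}) \le \Big(\frac{\varepsilon}{4C}\Big)^2 \quad \text{and} \quad E \int_s^t (1_A - 1_{\tilde{A}})^2 \, d\langle M, M\rangle_u \le \Big(\frac{\varepsilon}{4C}\Big)^2. \]

Hence, we have

\[
\begin{aligned}
\big| E[1_A (M_t - M_s)] - E[1_{\tilde{A}} (M_t - M_s)] \big|
&\le \big| E[(1_A - 1_{\tilde{A}})(M_t - M_s)] \big| \\
&\le P(A \,\Delta\, \tilde{A})^{1/2} \big( E(M_t - M_s)^2 \big)^{1/2} \\
&\le \frac{\varepsilon}{2}.
\end{aligned}
\]

By applying the Kunita-Watanabe inequality, we get, for $n \ge n_0$,

\[
\begin{aligned}
&\Big| E\Big[ 1_A \int_s^t \mu_u \, d\langle M, M\rangle_u \Big] - E\Big[ 1_{\tilde{A}} \int_s^t \mu_u^n \, d\langle M, M\rangle_u \Big] \Big| \\
&\quad \le \Big| E\Big[ 1_A \int_s^t (\mu_u - \mu_u^n) \, d\langle M, M\rangle_u \Big] \Big|
 + \Big| E\Big[ (1_A - 1_{\tilde{A}}) \int_s^t \mu_u^n \, d\langle M, M\rangle_u \Big] \Big| \\
&\quad \le \Big( E \int_s^t 1_A \, d\langle M, M\rangle_u \Big)^{1/2} \Big( E \int_s^t (\mu_u - \mu_u^n)^2 \, d\langle M, M\rangle_u \Big)^{1/2} \\
&\qquad + \Big( E \int_s^t (1_A - 1_{\tilde{A}})^2 \, d\langle M, M\rangle_u \Big)^{1/2} \Big( E \int_s^t (\mu_u^n)^2 \, d\langle M, M\rangle_u \Big)^{1/2} \\
&\quad \le (E\langle M, M\rangle_\infty)^{1/2} \, \|\mu - \mu^n\|_{L^2(M)} + \frac{\varepsilon}{4} \\
&\quad \le \frac{\varepsilon}{2},
\end{aligned}
\]

and thus,

\[
\begin{aligned}
&\Big| E[1_A (M_t - M_s)] - E\Big[ 1_A \int_s^t \mu_u \, d\langle M, M\rangle_u \Big] \Big| \\
&\quad \le \big| E[1_A (M_t - M_s)] - E[1_{\tilde{A}} (M_t - M_s)] \big| \\
&\qquad + \Big| E[1_{\tilde{A}} (M_t - M_s)] - E\Big[ 1_{\tilde{A}} \int_s^t \mu_u^n \, d\langle M, M\rangle_u \Big] \Big| \\
&\qquad + \Big| E\Big[ 1_{\tilde{A}} \int_s^t \mu_u^n \, d\langle M, M\rangle_u \Big] - E\Big[ 1_A \int_s^t \mu_u \, d\langle M, M\rangle_u \Big] \Big| \\
&\quad \le \varepsilon. \qquad \square
\end{aligned}
\]
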
We are now in a position to prove the main result of the section. 


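THEOREM 3.4. If $\sup_{n \ge 1} \|\mu^n\|_{L^2(M)} < \infty$, then $(\mu^n)$ converges in $L^2(M)$ to
a process $\mu$. Moreover,

\[ M - \int_0^\cdot \mu_s \, d\langle M, M\rangle_s \]

is a local martingale with respect to $\mathcal{G}_t = \bigvee_{n \ge 1} \mathcal{G}_t^n$, $t \ge 0$.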


PROOF. Set $c = \sup_{n \ge 1} \|\mu^n\|^2_{L^2(M)}$. Let $m > n \ge 1$, and note that $\mu^m - \mu^n$
is the information drift of $(\mathcal{G}_t^m)$ relative to $(\mathcal{G}_t^n)$. Therefore, property (iii) in
Lemma 1.6 implies

\[ \|\mu^m\|^2_{L^2(M)} = \|\mu^n\|^2_{L^2(M)} + \|\mu^m - \mu^n\|^2_{L^2(M)}. \]

Thus, $c = \lim_{n \to \infty} \|\mu^n\|^2_{L^2(M)}$ and

\[ \|\mu^m - \mu^n\|^2_{L^2(M)} = \|\mu^m\|^2_{L^2(M)} - \|\mu^n\|^2_{L^2(M)} \le c - \|\mu^n\|^2_{L^2(M)} \to 0 \]

as $n \to \infty$. Therefore, $(\mu^n)_{n \ge 1}$ is a Cauchy sequence in $L^2(M)$. By completeness
of $L^2(M)$, there exists a unique $(\mathcal{G}_t)$-predictable process $\mu^\infty \in L^2(M)$ such that
$\lim_{n \to \infty} \mu^n = \mu^\infty$ in $L^2(M)$. By Lemma 3.3, the process $M - \int \mu^\infty \, d\langle M, M\rangle$ is a
$(\mathcal{G}_t)$-local martingale. □


4. Continuous and initial enlargements. In this section we relate general 
enlargements of filtrations to "initial enlargements" along discrete partitions of 
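$[0, T]$, for finite horizon $T$. The knowledge of the insider is modeled by an arbitrary
filtration $(\mathcal{G}_t)_{t \in [0, T]}$, satisfying $\mathcal{G}_t \supset \mathcal{F}_t$, $0 \le t \le T$. For $s \in [0, T]$, we set

\[ \mathcal{G}_t^s = \begin{cases} \mathcal{F}_t, & t < s, \\ \mathcal{F}_t \vee \mathcal{G}_{s-}, & t \ge s. \end{cases} \]

Again, the analysis of this section does not require our probability space $(\Omega, \mathcal{F})$
to be standard.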


REMARK 4.1. In the case where the $\sigma$-field $\mathcal{G}_{s-}$, $s \in [0, T]$, is generated by a
countable number of events, say, $(A_n)_{n \in \mathbb{N}}$, the enlarged filtration $\mathcal{G}^s$ can be viewed
as initial enlargement at time $s$ in the classical sense. In that case $\mathcal{G}_{s-} = \sigma(1_{A_n} : n \in
\mathbb{N})$ and one has, for $t \in [0, T]$,

\[ \mathcal{F}_t \vee \mathcal{G}_{s-} = \mathcal{F}_t \vee \sigma(1_{A_n} : n \in \mathbb{N}). \]

The set $\{0, 1\}^{\mathbb{N}}$ can be endowed with a metric so that it becomes a Polish space
with corresponding Borel $\sigma$-field $\mathcal{B}(\{0, 1\})^{\otimes \mathbb{N}}$. Hence, the filtration $(\mathcal{G}_t^s)$ can be
seen as initial enlargement at time $s$ induced by the random variable $G : \Omega \to
\{0, 1\}^{\mathbb{N}}$, $\omega \mapsto (1_{A_n}(\omega))_{n \in \mathbb{N}}$. In particular, the standard theory of initial
enlargements is applicable. See [21].


In the following, we assume that $(\mathcal{G}_t^s)$ is for arbitrary $s \in [0, T]$ a finite utility
filtration. For $0 \le s \le t \le T$, we denote

\[ \pi_0([0, s) \times (t, T]) = F(s, t) := \tfrac{1}{2} E \int_t^T (\mu_u^s)^2 \, d\langle M, M\rangle_u, \]

where $\mu^s$ is a $(\mathcal{G}_t^s)$-information drift. So far $\pi_0$ is defined only on the set $\mathcal{J} =
\{[0, s) \times (t, T] : s \le t\}$. As the next lemma shows, $\pi_0$ can be extended to a measure
on the Borel sets of $D = \{(s, t) \in \mathbb{R}^2 : 0 \le s \le t \le T\}$.

LEMMA 4.2. There exists a unique measure $\pi$ on the Borel sets $\mathcal{B}(D)$ of $D$
satisfying $\pi|_{\mathcal{J}} = \pi_0$.



PROOF. Uniqueness is an immediate consequence of the measure extension
theorem. In order to show the existence of an extension, it suffices to verify the
following property which essentially amounts to countable additivity on a
generating semiring (see [13], Chapter II, Satz 3.8): For any $(s, t) \in D$ and any
sequence $(s_n, t_n)_{n \in \mathbb{N}}$ in $D$ with $s_n \le s$, $t_n \ge t$ and $\lim_{n \to \infty} (s_n, t_n) = (s, t)$, we have
$\lim_{n \to \infty} F(s_n, t_n) = F(s, t)$. Moreover, $F(s_n, t_n) \le F(s, t) < \infty$.

Let $s_n$, $t_n$, $s$ and $t$ be as above. Without loss of generality, we assume that $(s_n)$ is
monotonically increasing. For $u \in [t, T]$, we consider the filtrations $(\mathcal{G}_r^{s_n})_{r \in [u, T]}$,
$n \in \mathbb{N}$, over the time interval $[u, T]$. The filtrations are monotonically increasing
with $\bigvee_{n \in \mathbb{N}} \mathcal{G}_r^{s_n} = \mathcal{G}_r^s$, $r \in [u, T]$. Since $(\mu_r^{s_n})_{r \in [u, T]}$ are $(\mathcal{G}_r^{s_n})$-information drifts, it
follows (by Lemma 1.6) that

\[ E \int_u^T (\mu^s - \mu^{s_n}) \mu^{s_n} \, d\langle M, M\rangle = 0. \]

In particular,

\[ E \int_u^T (\mu^{s_n})^2 \, d\langle M, M\rangle \le E \int_t^T (\mu^s)^2 \, d\langle M, M\rangle < \infty. \]

By Theorem 3.4, the processes $(\mu_r^{s_n})_{r \in [u, T]}$ converge to the information drift
$(\mu_r^s)_{r \in [u, T]}$ in $L^2(M; [u, T])$. Therefore, for any $u \in (t, T]$,

\[ \liminf_{n \to \infty} E \int_{t_n}^T (\mu^{s_n})^2 \, d\langle M, M\rangle \ge E \int_u^T (\mu^s)^2 \, d\langle M, M\rangle. \]

Due to the continuity of $M$, the right-hand side of the previous equation tends
to $E \int_t^T (\mu^s)^2 \, d\langle M, M\rangle$ as $u \downarrow t$. Consequently, we obtain $\lim_{n \to \infty} F(s_n, t_n) =
F(s, t)$. □


The measure $\pi$ describes the utility increase by additional information. As will
be shown below, $\pi(D)$ is finite if and only if $(\mathcal{G}_t)$ is a finite utility filtration.

We now approximate the general filtration $(\mathcal{G}_t)$ by filtrations that can be seen as
successive initial enlargements. Let $\Delta : 0 = s_0 < \cdots < s_n = T$, $n \in \mathbb{N}$, be a partition
of the interval $[0, T]$. We let, for $r \in [s_i, s_{i+1})$, $i = 0, \ldots, n - 1$,


$^ = Gsi- V Fr. 


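PROPOSITION 4.3. For $i = 0, \ldots, n - 1$, let $\mu^i$ be a $(\mathcal{G}^{s_i}_r)$-information drift
and set $\mu^\Delta_r := \mu^i_r$ for $r \in [s_i, s_{i+1})$. Then $\mu^\Delta$ is a $(\mathcal{G}^\Delta_r)$-information drift. Moreover,

\[ \tfrac{1}{2} E \int_0^T (\mu^\Delta_u)^2 \, d\langle M, M\rangle_u = \pi(D^\Delta), \]

where $D^\Delta := \{(s,t) \in D : \exists\, i \in \{0, \ldots, n - 1\} \text{ with } s < s_i \text{ and } t > s_i\}$.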




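PROOF. It is straightforward to verify that $\mu^\Delta$ is an information drift for $(\mathcal{G}^\Delta_r)$.
Moreover,

\[
\begin{aligned}
\tfrac{1}{2} E \int_0^T (\mu^\Delta_u)^2 \, d\langle M, M\rangle_u
&= \tfrac{1}{2} \sum_{i=0}^{n-1} E \int_{s_i}^{s_{i+1}} (\mu^i_u)^2 \, d\langle M, M\rangle_u \\
&= \tfrac{1}{2} \sum_{i=0}^{n-1} \Bigl( E \int_{s_i}^{T} (\mu^i_u)^2 \, d\langle M, M\rangle_u - E \int_{s_{i+1}}^{T} (\mu^i_u)^2 \, d\langle M, M\rangle_u \Bigr) \\
&= \sum_{i=0}^{n-1} \pi([0, s_i) \times (s_i, s_{i+1}]) = \pi(D^\Delta). \qquad \Box
\end{aligned}
\]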


We can now state the main theorem of this section. 


THEOREM 4.4. Let $\Delta_n$, $n \in \mathbb{N}$, be a sequence of partitions of the interval
$[0, T]$, the mesh of which tends to 0. If $\pi(D)$ is finite, then the information
drifts $\mu^{\Delta_n}$ converge in $L^2(M)$ to a $(\mathcal{G}_t)$-information drift $\mu$. Moreover, the utility
gain of the insider satisfies

\[ u_{\mathcal{G}}(x) - u_{\mathcal{F}}(x) = \tfrac{1}{2} E \int_0^T \mu^2 \, d\langle M, M\rangle = \pi(D). \]

If $\pi(D)$ is infinite, then so is the utility gain of the insider.
The proof of the theorem is based on the following proposition. 


PROPOSITION 4.5. If $\pi(D) < \infty$, then there exists a $(\mathcal{G}_t)$-information drift $\mu$.
Moreover,

\[ \tfrac{1}{2} \|\mu\|^2_{L^2(M)} = \pi(D). \]


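PROOF. Let $\Delta_n$, $n \in \mathbb{N}$, be as in the above theorem with the additional
assumption that $\Delta_{n+1}$ is a refinement of $\Delta_n$ for all $n \in \mathbb{N}$. Then one has
$\mathcal{G}^{\Delta_n}_t \subset \mathcal{G}^{\Delta_{n+1}}_t$ for any $t \in [0, T]$. By Proposition 4.3,
$\tfrac{1}{2} \|\mu^{\Delta_n}\|^2_{L^2(M)} = \pi(D^{\Delta_n}) \le \pi(D)$.
Due to Theorem 3.4, the information drifts $\mu^{\Delta_n}$ converge to a
$(\bigvee_{n \in \mathbb{N}} \mathcal{G}^{\Delta_n}_t) = (\mathcal{G}_{t-})$-information drift $\mu$ in $L^2(M)$. By monotone convergence, we obtain that

\[ \pi(D) = \lim_{n\to\infty} \pi(D^{\Delta_n}) = \lim_{n\to\infty} \tfrac{1}{2} \|\mu^{\Delta_n}\|^2_{L^2(M)} = \tfrac{1}{2} \|\mu\|^2_{L^2(M)}. \]

Since every càdlàg $(\mathcal{G}_{t-})$-martingale is as well a $(\mathcal{G}_t)$-martingale, $\mu$ is a
$(\mathcal{G}_t)$-information drift. □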




PROOF OF THEOREM 4.4. Assume that $\pi(D)$ is finite. Since the mesh of the
partitions $\Delta_n$ tends to zero, one has $\lim_{n\to\infty} 1_{D^{\Delta_n}}(x) = 1$ for all $x \in D$.
Consequently, the dominated convergence theorem yields

(11) \[ \lim_{n\to\infty} \pi(D^{\Delta_n}) = \pi(D). \]

We established the existence of a $(\mathcal{G}_t)$-information drift $\mu$ in Proposition 4.5.
Recall that, by Lemma 1.6, the processes $\mu^{\Delta_n}$ and $\mu - \mu^{\Delta_n}$ are orthogonal in $L^2(M)$.
Consequently,

\[ \|\mu - \mu^{\Delta_n}\|^2_{L^2(M)} = \|\mu\|^2_{L^2(M)} - \|\mu^{\Delta_n}\|^2_{L^2(M)}. \]

Due to (11) and Propositions 4.3 and 4.5, the right-hand side of the previous equation converges to 0. Hence,
$\mu^{\Delta_n}$ converges to $\mu$ in $L^2(M)$. The remaining statements are consequences of
Propositions 4.5 and 1.4. □


5. Additional utility and entropy of filtrations. In this section we consider 
the link between the additional expected logarithmic utility of a better informed 
agent and the entropy of the additional information he possesses. The additional 
utility was first expressed in terms of a relative entropy in [23], page 1103, for a 
particular example. More generally, [3] discussed the link between the absolute 
entropy of a random variable describing initially available additional information 
and the utility increment of better informed agents. Here we shall see that the 
expected logarithmic utility increment is given by an integral version of relative 
entropies of the σ-algebras of the filtration. This notion can best be understood
as the limit of discrete entropy sums along a sequence of partitions of the trading 
interval as the mesh goes to 0. Alternatively, we are able to give an interpretation 
of the utility increment by Shannon information differences between the filtrations 
of the agents. In particular, we shall see that these differences are independent of 
any local martingales the filtrations may carry. 

Suppose that the assumptions of Section 2 are satisfied. Moreover, we assume
that $M$ is a continuous local martingale satisfying the (PRP) relative to $(\mathcal{F}_t)$, which
simply means that $L^A = 0$. Equation (7) simplifies to

\[ P_t(\cdot, A) = P(A) + \int_0^t k_s(\cdot, A) \, dM_s, \]

where $k(\cdot, A)$ is as in Section 2. Let again $(\mathcal{G}^0_t)$ be a filtration satisfying $\mathcal{F}^0_t \subset \mathcal{G}^0_t$
and being generated by countably many sets. To simplify notation, we assume the
filtration $(\mathcal{G}^0_t)$ to be left-continuous. Let $(\mathcal{G}_t)$ be the smallest completed and
right-continuous filtration containing $(\mathcal{G}^0_t)$. In the following, we assume that $(\mathcal{G}_t)$ is a
finite utility filtration and denote by $\mu$ its predictable information drift, that is,

\[ M - \int_0^{\cdot} \mu_t \, d\langle M, M\rangle_t \]




is a $(\mathcal{G}_t)$-local martingale. Recall that, by Theorem 2.10, we may assume that
$k_t(\omega, \cdot)$ is a signed measure. For a fixed $r \ge 0$, we define $\mu^r$ as the information
drift of the initially enlarged filtration $(\mathcal{G}^r_t)$, defined as in the beginning of the
preceding section. For stating the main result we need the following lemma.


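LEMMA 5.1. Let $0 \le s < t$ and $(\mathcal{P}^m)_{m \ge 0}$ be an increasing sequence of finite
partitions such that $\sigma(\mathcal{P}^m : m \ge 0) = \mathcal{G}^0_s$. Then

\[ \lim_{m\to\infty} E \int_s^t \sum_{A \in \mathcal{P}^m} \Bigl(\frac{k_u}{P_u}\Bigr)^2(\cdot, A) \, 1_A \, d\langle M, M\rangle_u = E \int_s^t (\mu^s_u)^2 \, d\langle M, M\rangle_u \]

and

\[ \lim_{m\to\infty} E \int_s^t \sum_{A \in \mathcal{P}^m} \frac{k_u}{P_u}(\cdot, A) \, 1_A \, \mu^s_u \, d\langle M, M\rangle_u = E \int_s^t (\mu^s_u)^2 \, d\langle M, M\rangle_u. \]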


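PROOF. By Lemma 2.10, the process

\[ Y^m_u(\omega, \omega') = \sum_{A \in \mathcal{P}^m} 1_A(\omega') \, \frac{k_u}{P_u}(\omega, A), \qquad m \ge 1, \]

is an $L^2$-bounded martingale for $P_M$-a.a. $(\omega, u) \in \Omega \times [s, t]$. Hence, $(Y^m)$ converges
$P_M$-a.s. to the density

\[ Y_u(\omega, \omega') = \frac{k_u(\omega, d\omega')}{P_u(\omega, d\omega')} \Big|_{\mathcal{G}^0_s}. \]

By Theorem 2.6, we have

\[ Y_u(\omega, \omega) = \mu^s_u(\omega) \]

$P_M$-a.s. on $\Omega \times [s, t]$ and, hence, the first result. In a similar way, one can prove
the second statement. □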


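We next discuss the important concept of the additional information of a σ-field
relative to a filtration.


DEFINITION 5.2. Let $\mathcal{A}$ be a sub-σ-algebra of $\mathcal{F}$ and $R$, $Q$ two probability
measures on $\mathcal{F}$. Then we define the relative entropy of $R$ with respect to $Q$ on the
σ-field $\mathcal{A}$ by

\[ \mathcal{H}_{\mathcal{A}}(R \| Q) = \begin{cases} \displaystyle\int \log \frac{dR}{dQ}\Big|_{\mathcal{A}} \, dR, & \text{if } R \ll Q \text{ on } \mathcal{A}, \\ \infty, & \text{else.} \end{cases} \]

Moreover, the additional information of $\mathcal{A}$ relative to the filtration $(\mathcal{F}_r)$ on $[s, t]$
($0 \le s < t \le T$) is defined by

\[ H_{\mathcal{A}}(s, t) = \int \mathcal{H}_{\mathcal{A}}\bigl(P_t(\omega, \cdot) \,\|\, P_s(\omega, \cdot)\bigr) \, dP(\omega). \]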


The following lemma establishes the basic link between the entropy of a fil- 
tration enlargement and additional logarithmic utility of a trader possessing this 
information advantage. 
LEMMA 5.3. For $0 \le s < t$, we have

\[ H_{\mathcal{G}^0_s}(s, t) = \tfrac{1}{2} E \int_s^t (\mu^s_u)^2 \, d\langle M, M\rangle_u. \]

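PROOF. Let $(\mathcal{P}^m)_{m \ge 0}$ be an increasing sequence of finite partitions such that
$\sigma(\mathcal{P}^m : m \ge 0) = \mathcal{G}^0_s$. Recall that, by (9),

\[
\begin{aligned}
\sum_{A \in \mathcal{P}^m} \bigl[ 1_A \log P_t(\cdot, A) - 1_A \log P_s(\cdot, A) \bigr]
= \sum_{A \in \mathcal{P}^m} \Bigl[ & \int_s^t \frac{k_u}{P_u}(\cdot, A) \, 1_A \, d\tilde{M}_u + \int_s^t \frac{k_u}{P_u}(\cdot, A) \, 1_A \, \mu_u \, d\langle M, M\rangle_u \\
& - \tfrac{1}{2} \int_s^t \Bigl(\frac{k_u}{P_u}\Bigr)^2(\cdot, A) \, 1_A \, d\langle M, M\rangle_u \Bigr].
\end{aligned}
\]

Since $\tilde{M}$ is a local martingale, we obtain by stopping and taking limits if necessary

\[
\begin{aligned}
E \sum_{A \in \mathcal{P}^m} P_t(\cdot, A) \log \frac{P_t(\cdot, A)}{P_s(\cdot, A)}
= E \sum_{A \in \mathcal{P}^m} \Bigl[ & \int_s^t \frac{k_u}{P_u}(\cdot, A) \, 1_A \, \mu_u \, d\langle M, M\rangle_u \\
& - \tfrac{1}{2} \int_s^t \Bigl(\frac{k_u}{P_u}\Bigr)^2(\cdot, A) \, 1_A \, d\langle M, M\rangle_u \Bigr].
\end{aligned}
\]

Note that in the previous line $\mu$ may be replaced by $\mu^s$, because $(\mu - \mu^s)$ is
orthogonal to $L^2(M)(\mathcal{G}^s)$ [see property (iv) in Lemma 1.6]. Applying Lemma 5.1
yields

\[ \lim_{m\to\infty} H_{\mathcal{P}^m}(s, t) = \tfrac{1}{2} E \int_s^t (\mu^s_u)^2 \, d\langle M, M\rangle_u. \]

Fatou's lemma implies

\[ \liminf_{m\to\infty} H_{\mathcal{P}^m}(s, t) \ge H_{\mathcal{G}^0_s}(s, t). \]

On the other hand, we have $H_{\mathcal{P}^m}(s, t) \le H_{\mathcal{G}^0_s}(s, t)$, since $\mathcal{P}^m \subset \mathcal{G}^0_s$ and, thus,

\[ \lim_{m\to\infty} H_{\mathcal{P}^m}(s, t) = H_{\mathcal{G}^0_s}(s, t), \]

which completes the proof. □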

Let us now return to the stepwise approximation of a filtration enlargement 


along a sequence of partitions of the trading interval by “initial enlargements,” 
and define their respective information increment. 




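DEFINITION 5.4. Let $\Delta : 0 = s_0 < \cdots < s_n = T$, $n \in \mathbb{N}$, be a partition of the
interval $[0, T]$ and let $\mu^\Delta$ be the information drift of $(\mathcal{G}^\Delta_t)$. The additional
information of $(\mathcal{G}^\Delta_t)$ relative to $(\mathcal{F}_t)$ is defined as

\[ H_\Delta = \sum_{i=0}^{n-1} H_{\mathcal{G}^0_{s_i}}(s_i, s_{i+1}). \]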


THEOREM 5.5. We have

\[ \lim_{|\Delta| \to 0} H_\Delta = \tfrac{1}{2} E \int_0^T \mu^2_u \, d\langle M, M\rangle_u. \]


PROOF. This follows from Theorem 4.4 and Lemma 5.3. □


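EXAMPLE 5.6. Let $\mathcal{G}^0_t = \mathcal{F}^0_t \vee \sigma(\mathcal{P})$, where $\mathcal{P}$ is a finite partition in $\mathcal{F}_T$.
Then $\mu^0 = \mu$ and by Lemma 5.3,

\[ H_{\mathcal{G}^0_0}(0, T) = \tfrac{1}{2} E \int_0^T \mu^2_u \, d\langle M, M\rangle_u. \]

If $\mathcal{F}_0$ is trivial, then

\[ H_{\mathcal{G}^0_0}(0, T) = -\sum_{A \in \mathcal{P}} P(A) \log P(A), \]

which is the absolute entropy of the partition $\mathcal{P}$. Thus, the additional logarithmic
utility of an agent with information $(\mathcal{G}_t)$ is equal to the entropy of $\mathcal{P}$. This example
shows that there is a link between logarithmic utility and the so-called Shannon
information.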
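
As a small numerical illustration of the entropy formula in Example 5.6, the following sketch evaluates $-\sum_{A} P(A) \log P(A)$ for a finite partition. The function name `partition_entropy` and the sample cell probabilities are illustrative assumptions and do not come from the paper; natural logarithms are used throughout.

```python
import math


def partition_entropy(probs):
    """Absolute entropy -sum P(A) log P(A) of a finite partition (natural log).

    `probs` lists the probabilities of the cells of the partition; cells of
    probability zero contribute nothing to the sum.
    """
    if abs(sum(probs) - 1.0) > 1e-12:
        raise ValueError("cell probabilities must sum to 1")
    return -sum(p * math.log(p) for p in probs if p > 0.0)


# Hypothetical partition with three cells of probability 1/2, 1/4, 1/4.
example_gain = partition_entropy([0.5, 0.25, 0.25])
```

Under the assumptions of the example (trivial initial σ-field), this number would be the additional logarithmic utility of an agent who knows from the start which cell of the partition occurs.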


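DEFINITION 5.7. Let $X$ and $Y$ be two random variables in some measurable
spaces. The mutual information (or Shannon information) between $X$ and $Y$ is
defined by

\[ I(X, Y) = \mathcal{H}(P_{X,Y} \,\|\, P_X \otimes P_Y). \]

Now let $Z$ be a third random variable. The conditional mutual information of $X$
and $Y$ given $Z$ is defined by

\[ I(X, Y | Z) = E\bigl[\mathcal{H}(P_{X,Y|Z} \,\|\, P_{X|Z} \otimes P_{Y|Z})\bigr], \]

provided the regular conditional probabilities exist.
If $\mathcal{A}$ is a sub-σ-algebra of $\mathcal{F}$, then we write $\mathrm{id}_{\mathcal{A}}$ for the measurable map
$(\Omega, \mathcal{F}) \to (\Omega, \mathcal{A})$, $\omega \mapsto \omega$. For two sub-σ-algebras $\mathcal{A}$ and $\mathcal{D}$, we abbreviate

\[ I(\mathcal{A}, \mathcal{D}) = I(\mathrm{id}_{\mathcal{A}}, \mathrm{id}_{\mathcal{D}}). \]

Since our probability space is standard, for any sub-σ-fields $\mathcal{A}$, $\mathcal{D}$, $\mathcal{E}$ of $\mathcal{F}$, there
exists a regular conditional probability $P_{\mathrm{id}_{\mathcal{A}}, \mathrm{id}_{\mathcal{D}} | \mathrm{id}_{\mathcal{E}}}$ and we define

\[ I(\mathcal{A}, \mathcal{D} | \mathcal{E}) := I(\mathrm{id}_{\mathcal{A}}, \mathrm{id}_{\mathcal{D}} | \mathrm{id}_{\mathcal{E}}). \]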




The mutual information was introduced by Shannon as a measure of informa- 
tion. It plays an important role in information theory (see, e.g., [16]). 
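
For discrete random variables, the definition of mutual information as a relative entropy between the joint law and the product of the marginals reduces to a finite sum. The sketch below computes it for a joint probability table; the function name `mutual_information` and the nested-list representation of the joint law are assumptions made only for this illustration.

```python
import math


def mutual_information(joint):
    """I(X, Y) = sum_{x,y} p(x,y) log( p(x,y) / (p(x) p(y)) ), natural log.

    `joint[i][j]` is the probability of the event {X = i, Y = j}.
    """
    px = [sum(row) for row in joint]        # marginal law of X
    py = [sum(col) for col in zip(*joint)]  # marginal law of Y
    return sum(
        p * math.log(p / (px[i] * py[j]))
        for i, row in enumerate(joint)
        for j, p in enumerate(row)
        if p > 0.0
    )


# Independent fair coins carry no mutual information.
independent = mutual_information([[0.25, 0.25], [0.25, 0.25]])
# Two identical fair coins: the mutual information equals the entropy log 2.
identical = mutual_information([[0.5, 0.0], [0.0, 0.5]])
```

The second case illustrates the situation in which one variable is a function of the other, where the mutual information reduces to an absolute entropy.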


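THEOREM 5.8.

\[ \lim_{|\Delta| \to 0} \sum_i I(\mathcal{G}^0_{s_i}, \mathcal{F}^0_{s_{i+1}} | \mathcal{F}^0_{s_i}) = \tfrac{1}{2} E \int_0^T \mu^2_u \, d\langle M, M\rangle_u. \]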


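PROOF. Note that, for three random variables $X$, $Y$ and $Z$, we have

\[ \frac{dP_{X,Y|Z}}{d(P_{X|Z} \otimes P_{Y|Z})} = \frac{dP_{X|(Y,Z)}}{dP_{X|Z}}. \]

This property implies that one has, for $0 \le s < t \le T$,

\[
\begin{aligned}
I(\mathcal{G}^0_s, \mathcal{F}^0_t | \mathcal{F}^0_s)
&= \int \log \frac{dP_{\mathrm{id}_{\mathcal{G}^0_s} | \mathrm{id}_{\mathcal{F}^0_t}}}{dP_{\mathrm{id}_{\mathcal{G}^0_s} | \mathrm{id}_{\mathcal{F}^0_s}}} \, dP \\
&= \int \int \log \frac{P_t(\omega, d\omega')}{P_s(\omega, d\omega')} \Big|_{\mathcal{G}^0_s} \, P_t(\omega, d\omega') \, P(d\omega) \\
&= H_{\mathcal{G}^0_s}(s, t).
\end{aligned}
\]

Thus, the assertion is an immediate consequence of Theorem 5.5. □
This result motivates the following notion.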
This result motivates the following notion. 


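DEFINITION 5.9. The information difference of $(\mathcal{G}^0_t)$ relative to $(\mathcal{F}^0_t)$ up to
time $T$ is defined as

\[ A(\mathcal{G}^0, \mathcal{F}^0) = \lim_{|\Delta| \to 0} \sum_i I(\mathcal{G}^0_{s_i}, \mathcal{F}^0_{s_{i+1}} | \mathcal{F}^0_{s_i}). \]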


REMARK 5.10. Note that we did not use $M$ in our definition of the information
difference of $(\mathcal{G}^0_t)$ relative to $(\mathcal{F}^0_t)$. However, by Theorem 5.8, the information
difference may be represented in terms of any local martingale satisfying
the (PRP).


Theorem 5.8 can be reformulated in the following way. 


THEOREM 5.11. The additional utility of an agent with information $(\mathcal{G}_t)$ is
equal to the information difference of $(\mathcal{G}^0_t)$ relative to $(\mathcal{F}^0_t)$, that is,

\[ u_{\mathcal{G}}(x) - u_{\mathcal{F}}(x) = A(\mathcal{G}^0, \mathcal{F}^0). \]

If $(\mathcal{G}_t)$ is initially enlarged by some random variable $G$, then the information
difference of $(\mathcal{G}^0_t)$ relative to $(\mathcal{F}^0_t)$ coincides with the Shannon information
between $G$ and $\mathcal{F}^0_T$.




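LEMMA 5.12. Let $\mathcal{G}^0_t = \mathcal{F}^0_t \vee \sigma(G)$, where $G$ is a random variable with values
in some Polish space. Then

\[ A(\mathcal{G}^0, \mathcal{F}^0) = I(G, \mathcal{F}^0_T | \mathcal{F}^0_0). \]

PROOF. Let $0 \le s < t$. By standard arguments, we have
$I(\mathcal{G}^0_s, \mathcal{F}^0_t | \mathcal{F}^0_s) = I(G, \mathcal{F}^0_t | \mathcal{F}^0_s)$ and

\[
\begin{aligned}
I(G, \mathcal{F}^0_t | \mathcal{F}^0_0) &= I(G, (\mathcal{F}^0_s, \mathcal{F}^0_t) | \mathcal{F}^0_0) \\
&= I(G, \mathcal{F}^0_s | \mathcal{F}^0_0) + I(G, \mathcal{F}^0_t | \mathcal{F}^0_s)
\end{aligned}
\]

(see, e.g., [16], Theorem 1.6.3). By iteration, we obtain, for all partitions $\Delta$,

\[ \sum_i I(\mathcal{G}^0_{s_i}, \mathcal{F}^0_{s_{i+1}} | \mathcal{F}^0_{s_i}) = I(G, \mathcal{F}^0_T | \mathcal{F}^0_0) \]

and, hence, the result. □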


THEOREM 5.13. Let $\mathcal{G}^0_t = \mathcal{F}^0_t \vee \sigma(G)$, where $G$ is a random variable with
values in some Polish space. Then the additional logarithmic utility of an agent
with information $(\mathcal{G}_t)$ is equal to the Shannon information between $G$ and $\mathcal{F}^0_T$
conditioned on $\mathcal{F}^0_0$, that is,

\[ u_{\mathcal{G}}(x) - u_{\mathcal{F}}(x) = I(\mathcal{F}^0_T, G | \mathcal{F}^0_0). \]

In particular, if $\mathcal{F}^0_0$ is trivial, then the additional utility is equal to $I(\mathcal{F}^0_T, G)$.
PROOF. This follows from Lemma 5.12 and Theorem 5.8. □


REMARK 5.14. If $\mathcal{G}^0_t = \mathcal{F}^0_t \vee \sigma(G)$ and $G$ is $\mathcal{F}^0_T$-measurable, then the mutual
information $I(\mathcal{F}^0_T, G | \mathcal{F}^0_0)$ is equal to the conditional absolute entropy of $G$ (see
also [3]).


EXAMPLE 5.15. Let $(\Omega, \mathcal{F}, P)$ be the one-dimensional canonical Wiener
space equipped with the Wiener process $(W_t)_{0 \le t \le 1}$. More precisely,
$\Omega = C([0, 1], \mathbb{R})$ is the set of continuous functions on $[0, 1]$ starting in 0, $\mathcal{F}$ the
σ-algebra of Borel sets with respect to uniform convergence, $P$ the Wiener measure
and $W$ the coordinate process. $(\mathcal{F}_t)_{0 \le t \le 1}$ is obtained by completing the natural
filtration $(\mathcal{F}^0_t)$. Suppose the price process $S$ is of the form

\[ S_t = \exp(W_t + bt), \qquad 0 \le t \le 1, \]

with $b \in \mathbb{R}$. We want to calculate the additional utility of an insider knowing
whether the price exceeds a certain level or not. More precisely, we suppose the
insider to know the value of

\[ G = 1_{(c, \infty)}(S^*_1), \]




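where $c > 0$ and $S^*_1 = \max_{0 \le t \le 1} S_t$. By Remark 5.14, the additional utility is equal
to the entropy

\[ H(G) = -\bigl[ p \log p + (1 - p) \log(1 - p) \bigr], \]

where

\[ p = P(S^*_1 > c). \]

This may be calculated via Girsanov's theorem. Namely, we have

\[
\begin{aligned}
P(S^*_1 > c) &= P\Bigl( \max_{t \in [0,1]} (W_t + bt) > \log c \Bigr) \\
&= \int_0^1 \exp\Bigl( b \log c - \frac{b^2}{2} s \Bigr) \frac{|\log c|}{\sqrt{2\pi s^3}} \exp\Bigl( -\frac{|\log c|^2}{2s} \Bigr) \, ds.
\end{aligned}
\]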
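
The integral for $p = P(S^*_1 > c)$ can be evaluated numerically. The sketch below does so with a midpoint rule and compares the result with the standard closed form for the first-passage probability of a Brownian motion with drift, $\Phi(b - a) + e^{2ab}\Phi(-a - b)$ with $a = \log c$. It assumes $c > 1$, so that the level $a$ lies above the starting point; the function names, the choice of quadrature and the sample parameters $c = 2$, $b = 0.1$ are illustrative assumptions, not part of the paper.

```python
import math


def std_normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def hitting_prob_integral(c, b, n=20000):
    """Midpoint-rule value of the first-passage integral over s in (0, 1]; needs c > 1."""
    a = math.log(c)
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h
        total += (
            math.exp(b * a - 0.5 * b * b * s)
            * a / math.sqrt(2.0 * math.pi * s ** 3)
            * math.exp(-a * a / (2.0 * s))
        )
    return h * total


def hitting_prob_closed(c, b):
    """Closed form of P(max_{t<=1} (W_t + b t) > log c) for c > 1."""
    a = math.log(c)
    return std_normal_cdf(b - a) + math.exp(2.0 * a * b) * std_normal_cdf(-a - b)


def binary_entropy(p):
    """Entropy -p log p - (1-p) log(1-p) of an indicator with success probability p."""
    return -sum(q * math.log(q) for q in (p, 1.0 - p) if q > 0.0)


p_numeric = hitting_prob_integral(2.0, 0.1)
p_closed = hitting_prob_closed(2.0, 0.1)
utility_gain = binary_entropy(p_numeric)
```

Since the entropy of an indicator never exceeds $\log 2$, the insider's gain in this example is bounded by $\log 2$ whatever the values of $b$ and $c$.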


6. Mutual information estimates. In this final section we apply some re- 
sults from information theory to derive estimates for the information of a better 
informed agent. This yields a priori estimates for the agent's additional expected 
logarithmic utility in the light of the preceding section. Among other facts, the dif- 
ferential entropy maximizing property of Gaussian laws will play a role. We adopt 
the notation of [16]. 

Before we provide the information estimates, we summarize some basic facts 
of the mutual information (see [16], Theorem 1.6.3). For random variables X, Y, 
Z in some Borel spaces, the following properties hold: 


(I.1) $I(X, Y | Z) \ge 0$ and $I(X, Y | Z) = 0$ if and only if $X$ and $Y$ are independent
given $Z$.

(I.2) $I(X, (Y, Z)) = I(X, Z) + I(X, Y | Z)$.

(I.3) If $X$ is a continuous random variable with finite differential entropy, then

\[ I(X, Y) = h(X) - h(X | Y). \]


For some fixed integer $d \in \mathbb{N}$, let $X$ be an $\mathcal{F}^0_T$-measurable $\mathbb{R}^d$-valued random
variable. Moreover, denote by $Y$ a $d$-dimensional r.v. that is independent of the
σ-field $\mathcal{F}^0_T$. We consider the enlarged filtration $\mathcal{G}^0_t = \mathcal{F}^0_t \vee \sigma(G)$, where
$G := X + Y$.


LEMMA 6.1. Suppose that the law of $Y$ is absolutely continuous with respect
to Lebesgue measure and has finite differential entropy

\[ h(Y) = -\int \frac{dP_Y}{dy}(y) \log \frac{dP_Y}{dy}(y) \, dy. \]

Then

(12) \[ I(G, \mathcal{F}^0_T) = h(X + Y) - h(Y). \]




PROOF. Due to property (I.2), we have

\[ I(G, \mathcal{F}^0_T) = I(X + Y, X) + I(X + Y, \mathcal{F}^0_T | X). \]

Given $X$, the r.v.'s $X + Y$ and $\mathrm{id}_{\mathcal{F}^0_T}$ are independent. Therefore, (I.1) and (I.3)
lead to

\[ I(G, \mathcal{F}^0_T) = I(X + Y, X) = h(X + Y) - h(X + Y | X) = h(X + Y) - h(Y). \qquad \Box \]


Now assume the perturbation $Y$ to be an $\mathbb{R}^d$-valued centered Gaussian r.v. that is
independent of $\mathcal{F}^0_T$.


LEMMA 6.2. Suppose that $X \in L^2(P)$ and let $C_X$ and $C_Y$ denote the covariance
matrices of $X$ and $Y$, respectively. Then

(13) \[ I(G, \mathcal{F}^0_T) \le \tfrac{1}{2} \log \frac{\det(C_X + C_Y)}{\det(C_Y)}. \]

Moreover, equality holds in (13) if $X$ is Gaussian.


PROOF. The distribution of $Y$ is absolutely continuous with respect to Lebesgue measure
and has finite entropy. Therefore,

\[ I(G, \mathcal{F}^0_T) = h(X + Y) - h(Y). \]

Let $C_X$ and $C_Y$ denote the covariance matrices of $X$ and $Y$, respectively. Due
to the independence of $X$ and $Y$, the random variable $X + Y$ has the covariance
matrix $C_{X+Y} = C_X + C_Y$. Next recall that the normal distribution maximizes the
differential entropy under a covariance constraint, that is, $h(X + Y) \le h(Z)$, where
$Z$ is a centered Gaussian r.v. with covariance matrix $C_{X+Y}$. Therefore,

\[ I(X, X + Y) \le h(Z) - h(Y). \]

Using the formula for the differential entropy of Gaussian measures (Theorem 1.8.1, [16]),
we obtain

\[
\begin{aligned}
h(Z) - h(Y) &= \tfrac{1}{2} \log\bigl( (2\pi e)^d \det(C_{X+Y}) \bigr) - \tfrac{1}{2} \log\bigl( (2\pi e)^d \det(C_Y) \bigr) \\
&= \tfrac{1}{2} \log \frac{\det(C_{X+Y})}{\det(C_Y)}.
\end{aligned}
\]

If $X$ is Gaussian, then $h(X + Y) = h(Z)$ and, hence, the second statement of the
lemma follows. □


COROLLARY 6.3. Assume that additionally to the assumptions of the above
lemma, the equation $Y = \kappa N$ is valid, where $N$ is a $d$-dimensional standard normal
r.v. and $\kappa > 0$. Then

\[ I(G, \mathcal{F}^0_T) \le \tfrac{1}{2} \sum_{j=1}^{d} \log \frac{\lambda_j + \kappa^2}{\kappa^2}, \]

where $\lambda_j$ ($j = 1, \ldots, d$) denote the eigenvalues of $C_X$.




PROOF. The proof follows easily by computing the determinants in Lemma 6.2. □


The proof of Lemma 6.2 is based on the fact that Gaussian distributions maxi- 
mize the differential entropy under a constraint on the covariance structure. Let us 
recall the construction of entropy maximizing measures under a linear constraint. 


LEMMA 6.4. Let $E \subset \mathbb{R}^d$ be a measurable set, $c > 0$ and $g : E \to [0, \infty)$ a
measurable map. Assume that there exist constants $Z, t > 0$, such that the measure
$\nu$ defined by

\[ \frac{d\nu}{dx} = \frac{1}{Z} e^{-t g(x)} \]

is a probability measure satisfying $E^\nu[g] = c$. Then $\nu$ is the unique probability
measure maximizing the differential entropy among all continuous probability
measures $\mu$ on $E$ satisfying $E^\mu[g] = c$.


The entropy maximization problem is equivalent to minimizing the relative entropy
$\mathcal{H}(\cdot \| \nu)$. Hence, the problem can be treated under more general constraints
by using results of [9], Theorem 3.1.


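PROOF. Let $\mu$ be a continuous probability measure on $E$ with $E^\mu[g] = c$.
Then

\[
\begin{aligned}
\mathcal{H}(\mu \| \nu) &= E^\mu \log \frac{d\mu}{d\nu} = E^\mu \log \frac{d\mu}{dx} - E^\mu \log \frac{d\nu}{dx} \\
&= -h(\mu) - E^\mu \log \frac{e^{-tg}}{Z} = -h(\mu) + h(\nu).
\end{aligned}
\]

Since $\mathcal{H}(\mu \| \nu) \ge 0$ and $\mathcal{H}(\mu \| \nu) = 0$ iff $\mu = \nu$, $\nu$ is the unique maximizer of the
differential entropy. □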


REMARK 6.5. The above lemma can be used to derive results similar to those
obtained in Lemma 6.2. For instance, for $E := \mathbb{R}$ and $g(x) := |x|$, one obtains
that the two-sided exponential distribution maximizes the differential entropy
under the constraint $E^\mu g = c$ ($c > 0$). In particular, the measure $\nu$ with
$\frac{d\nu}{dx}(x) = (2c)^{-1} e^{-|x|/c}$ satisfies

\[ E^\nu[g] = c \quad \text{and} \quad h(\nu) = 1 + \log(2c). \]

Now let $X$ be a real-valued r.v. in $L^1(P)$. Moreover, let $\kappa_1 := E[|X - EX|]$ and $Y$
be a two-sided exponential distribution with $E|Y| =: \kappa_2$. Then due to Lemma 6.1,

\[ I(G, \mathcal{F}^0_T) \le \log\Bigl( \frac{\kappa_1 + \kappa_2}{\kappa_2} \Bigr). \]
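
The entropy value $h(\nu) = 1 + \log(2c)$ for the two-sided exponential law can be checked numerically, and the resulting estimate $\log((\kappa_1 + \kappa_2)/\kappa_2)$ is easy to tabulate. The sketch below is only an illustration: the function names, the truncation of the integral at 40 scale lengths and the sample values are assumptions of this example.

```python
import math


def laplace_entropy_numeric(c, n=200_000, cutoff=40.0):
    """Midpoint-rule value of -int f log f for f(x) = exp(-|x|/c) / (2c).

    By symmetry only the half line [0, cutoff * c] is integrated and doubled;
    the neglected tail beyond cutoff * c is of order exp(-cutoff).
    """
    upper = cutoff * c
    h = upper / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        f = math.exp(-x / c) / (2.0 * c)
        total -= f * math.log(f)
    return 2.0 * h * total


def laplace_information_bound(kappa1, kappa2):
    """Upper bound log((kappa1 + kappa2) / kappa2) on the mutual information."""
    return math.log((kappa1 + kappa2) / kappa2)


entropy_check = laplace_entropy_numeric(0.7)
bound_equal_scales = laplace_information_bound(1.0, 1.0)
```

The bound grows only logarithmically in the ratio $\kappa_1 / \kappa_2$, so a noise level comparable to the mean absolute deviation of $X$ already limits the information to $\log 2$.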




EXAMPLE 6.6. We consider the classical stock market model with one asset.
Let $(\mathcal{F}^0_t)_{t \in [0,T]}$ be a Brownian filtration generated by the Brownian motion
$(B_t)_{t \in [0,T]}$ and denote by $(\mathcal{F}_t)$ its completion. The stock price is modeled by the
process

\[ S_t = S_0 \exp(B_t + bt), \]

where $S_0 > 0$ is the deterministic stock price at time 0 and $b \in \mathbb{R}$. For some fixed
times $t_1, \ldots, t_d \in (0, T]$ ($d \in \mathbb{N}$), let $X := (B_{t_i})_{i=1,\ldots,d}$. We suppose that the insider
bases his investment on the filtration $\mathcal{G}_t = \bigcap_{s > t} \mathcal{F}_s \vee \sigma(G)$, where $G = X + \kappa N$
and $N$ is a standard normal r.v. in $\mathbb{R}^d$ that is independent of $\mathcal{F}_T$. Due to Lemma 6.2,
the additional utility of the insider is related to the eigenvalues of the matrix

\[
C_X = (t_i \wedge t_j)_{i,j=1,\ldots,d} =
\begin{pmatrix}
t_1 & t_1 & \cdots & t_1 \\
t_1 & t_2 & \cdots & t_2 \\
\vdots & \vdots & \ddots & \vdots \\
t_1 & t_2 & \cdots & t_d
\end{pmatrix}.
\]
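
To make the determinant formula of Lemma 6.2 concrete for this example, the sketch below builds the covariance matrix of $(B_{t_1}, \ldots, B_{t_d})$, whose entries are $\min(t_i, t_j)$, and evaluates $\tfrac{1}{2} \log\bigl(\det(C_X + \kappa^2 I) / \det(\kappa^2 I)\bigr)$ through a Cholesky factorization. All function names and the sample inputs are assumptions made for this illustration; the pure-Python factorization is chosen only to keep the example dependency-free.

```python
import math


def brownian_covariance(times):
    """Covariance matrix of (B_{t_1}, ..., B_{t_d}): entries min(t_i, t_j)."""
    return [[min(s, t) for t in times] for s in times]


def log_det_spd(matrix):
    """log det of a symmetric positive definite matrix via Cholesky factorization."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    log_det = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][i] = math.sqrt(s)
                log_det += 2.0 * math.log(lower[i][i])
            else:
                lower[i][j] = s / lower[j][j]
    return log_det


def information_bound(times, kappa):
    """0.5 * log( det(C_X + kappa^2 I) / det(kappa^2 I) ) for X = (B_{t_i})."""
    d = len(times)
    c_x = brownian_covariance(times)
    c_sum = [
        [c_x[i][j] + (kappa ** 2 if i == j else 0.0) for j in range(d)]
        for i in range(d)
    ]
    return 0.5 * (log_det_spd(c_sum) - 2.0 * d * math.log(kappa))


single_time = information_bound([1.0], 1.0)
two_times = information_bound([1.0, 2.0], 1.0)
```

Because $X$ is Gaussian in this example, Lemma 6.2 states that the bound is attained, so these values would be the mutual information itself rather than a strict upper estimate.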


Let us finish the section with an example for a general enlargement. 


EXAMPLE 6.7. We reconsider the classical stock market model of Example 6.6
with $T := 1$. The knowledge of the insider at time $t$ is modeled by
$\mathcal{G}_t = \bigcap_{r > t} \mathcal{F}_r \vee \sigma((G_s)_{s \in [0,r]})$, where $G_t := B_1 + \tilde{B}_{g(1-t)}$, $(\tilde{B}_t)$ is a Brownian
motion independent of $(B_t)$ and $g : [0, 1] \to [0, \infty)$ is a decreasing function. We
are therefore in a setting similar to Example 2.13. We now calculate the utility
increment from the perspective of the notion of information difference of filtrations.
Let $\pi$ be as in Section 4. For $0 \le s < t \le 1$, we have

\[
\begin{aligned}
\pi([0, s) \times (s, t]) &= I((G_u)_{u \in [0,s]}, \mathcal{F}^0_t | \mathcal{F}^0_s) \\
&= I(G_s, \mathcal{F}^0_t | \mathcal{F}^0_s) = I(G_s, B_t | \mathcal{F}^0_s) + I(G_s, \mathcal{F}^0_t | \mathcal{F}^0_s, B_t) \\
&= I(B_1 + \tilde{B}_{g(1-s)}, B_t - B_s | \mathcal{F}^0_s) \\
&= I(B_1 - B_s + \tilde{B}_{g(1-s)}, B_t - B_s).
\end{aligned}
\]

Using the formula for the differential entropy for Gaussian measures, we obtain

\[
\begin{aligned}
\pi([0, s) \times (s, t]) &= h(B_1 - B_s + \tilde{B}_{g(1-s)}) - h(B_1 - B_t + \tilde{B}_{g(1-s)}) \\
&= \tfrac{1}{2} \log\bigl( 2\pi e (1 - s + g(1 - s)) \bigr) - \tfrac{1}{2} \log\bigl( 2\pi e (1 - t + g(1 - s)) \bigr) \\
&= \tfrac{1}{2} \log \frac{1 - s + g(1 - s)}{1 - t + g(1 - s)}.
\end{aligned}
\]

Alternatively, one can express $\pi([0, s) \times (s, t])$ as

\[ \pi([0, s) \times (s, t]) = \tfrac{1}{2} \int_s^t \frac{1}{1 - u + g(1 - s)} \, du. \]

For a partition $\Delta : 0 = t_0 < \cdots < t_m = 1$ ($m \in \mathbb{N}$), we consider $D^\Delta$ as in Section 4.
One has

\[
\begin{aligned}
\pi(D^\Delta) &= \sum_{i=0}^{m-1} \pi([0, t_i) \times (t_i, t_{i+1}]) \\
&= \tfrac{1}{2} \int_0^1 \frac{1}{1 - u + g(1 - \max\{t_i : t_i \le u\})} \, du.
\end{aligned}
\]

Next, choose a sequence of refining partitions $(\Delta_n)$ such that their mesh tends to 0.
Then the term in the latter integral is monotonically increasing in $n$ and convergent.
Hence, one obtains

\[ \lim_{n\to\infty} \pi(D^{\Delta_n}) = \tfrac{1}{2} \int_0^1 \frac{1}{1 - u + g(1 - u)} \, du. \]

On the other hand,

\[ \lim_{n\to\infty} \pi(D^{\Delta_n}) = \pi(D) = u_{\mathcal{G}}(x) - u_{\mathcal{F}}(x). \]

Consequently, the insider has finite utility if and only if the integral
$\int_0^1 \frac{1}{1 - u + g(1 - u)} \, du$ is finite. Now suppose $g(y) = C y^p$ for some $C > 0$ and $p > 0$. It is straightforward
to show that the integral and, hence, the additional utility, is finite if and only if
$p \in (0, 1)$. This equivalence follows also from results in [8], where the authors
compute explicitly the information drift.
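
The dichotomy between $p < 1$ and $p \ge 1$ can be seen numerically. With $y = 1 - u$ the utility integral becomes $\tfrac{1}{2} \int_0^1 (y + C y^p)^{-1} \, dy$; truncating at $y = \varepsilon$ and substituting $y = e^{-v}$ turns it into $\tfrac{1}{2} \int_0^{\log(1/\varepsilon)} (1 + C e^{(1-p)v})^{-1} \, dv$, which has a smooth integrand. The sketch below evaluates this truncated integral with a midpoint rule; the function name, the truncation levels and the sample parameters are assumptions of this illustration.

```python
import math


def truncated_gain(C, p, eps, n=200_000):
    """0.5 * integral over [eps, 1] of dy / (y + C * y**p), via y = exp(-v).

    After the substitution the integrand is 1 / (1 + C * exp((1 - p) * v))
    on [0, log(1/eps)], which the midpoint rule handles without difficulty.
    """
    length = math.log(1.0 / eps)
    h = length / n
    total = 0.0
    for k in range(n):
        v = (k + 0.5) * h
        total += 1.0 / (1.0 + C * math.exp((1.0 - p) * v))
    return 0.5 * h * total


# p = 1/2, C = 1: the full integral equals log((1 + C) / C) = log 2.
finite_case = truncated_gain(1.0, 0.5, 1e-12)
# p = 1, C = 1: the truncated value is log(1/eps) / 4 and grows without bound.
divergent_small = truncated_gain(1.0, 1.0, 1e-6)
divergent_smaller = truncated_gain(1.0, 1.0, 1e-12)
```

In the first case lowering the truncation level changes the value only marginally, while in the second the value doubles when $\log(1/\varepsilon)$ doubles, in line with the stated criterion.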


REFERENCES 


[1] AMENDINGER, J., BECHERER, D. and SCHWEIZER, M. (2003). A monetary value for initial
information in portfolio optimization. Finance Stoch. 7 29-46. MR1954386
[2] ANKIRCHNER, S. and IMKELLER, P. (2005). Finite utility on financial markets with
asymmetric information and structure properties of the price dynamics. Ann. Inst. H. Poincaré
Probab. Statist. 41 479-503. MR2139030
[3] AMENDINGER, J., IMKELLER, P. and SCHWEIZER, M. (1998). Additional logarithmic utility
of an insider. Stochastic Process. Appl. 75 263-286. MR1632213
[4] ANKIRCHNER, S. (2005). Information and semimartingales. Ph.D. thesis, Humboldt Univ.,
Berlin.
[5] BAUDOIN, F. (2001). Conditioning of Brownian functionals and applications to the modelling
of anticipations on a financial market. Ph.D. thesis, Univ. Pierre et Marie Curie.
[6] BIAGINI, F. and ØKSENDAL, B. (2005). A general stochastic calculus approach to insider
trading. Appl. Math. Optim. 52 167-181. MR2157199
[7] CAMPI, L. (2003). Some results on quadratic hedging with insider trading. Stochastics 77
327-348. MR2165112
[8] CORCUERA, J., IMKELLER, P., KOHATSU-HIGA, A. and NUALART, D. (2003). Additional
utility of insiders with imperfect dynamical information. Preprint.




[9] CSISZAR, I. (1975). I-divergence geometry of probability distributions and minimization prob- 

lems. Ann. Probab. 3 146-158. MR0365798 

[10] DUFFIE, D. and HUANG, C. (1986). Multiperiod security markets with differential informa- 
tion: Martingales and resolution times. J. Math. Econom. 15 283—303. MR0871158 

[11] DELLACHERIE, C. and MEYER, P.-A. (1978). Probabilities and Potential. North-Holland, 
Amsterdam. MR0521810 

[12] DELBAEN, F. and SCHACHERMAYER, W. (1995). The existence of absolutely continuous local
martingale measures. Ann. Appl. Probab. 5 926-945. MR1384360

[13] ELSTRODT, J. (1996). Maß- und Integrationstheorie. (Measure and Integration Theory).
Springer, Berlin.

[14] GRORUD, A. and PONTIER, M. (1998). Insider trading in a continuous time market model. 
Internat. J. Theoret. Appl. Finance 1 331—347. 

[15] GASBARRA, D. and VALKEILA, E. (2003). Initial enlargement: A Bayesian approach. 
Preprint. 

[16] IHARA, S. (1993). Information Theory for Continuous Systems. World Scientific, Singapore. 
MR1249933 

[17] IMKELLER, P. (1996). Enlargement of the Wiener filtration by an absolutely continuous ran- 
dom variable via Malliavin’s calculus. Probab. Theory Related Fields 106 105—135. 
MR1408418 

[18] IMKELLER, P. (2002). Random times at which insiders can have free lunches. Stochastics Sto- 
chastics Rep. 74 465-487. MR1940496 

[19] IMKELLER, P. (2001). Malliavin's calculus in insider models: Additional utility and free 
lunches. Math. Finance 13 153-169. MR1968102 

[20] IMKELLER, P., PONTIER, M. and WEISZ, F. (2001). Free lunch and arbitrage possibili- 
ties in a financial market model with an insider. Stochastic Process. Appl. 92 103-130. 
MR1815181 

[21] JEULIN, TH. and YOR, M., EDS. (1985). Grossissements de Filtrations: Exemples et Applica- 
tions. Springer, Berlin. MR0884713 

[22] PARTHASARATHY, K. R. (1977). Introduction to Probability and Measure. MacMillan Co. of
India Ltd. Delhi. MR0651013

[23] PIKOVSKY, I. and KARATZAS, I. (1996). Anticipative portfolio optimization. Adv. in Appl. 
Probab. 28 1095-1122. MR1418248 

[24] REVUZ, D. and Yor, M. (1999). Continuous Martingales and Brownian Motion, 3rd ed. 
Springer, Berlin. MR1725357 


S. ANKIRCHNER
P. IMKELLER
INSTITUT FÜR MATHEMATIK
HUMBOLDT-UNIVERSITÄT ZU BERLIN
UNTER DEN LINDEN 6
10099 BERLIN
GERMANY
E-MAIL: imkeller@mathematik.hu-berlin.de

S. DEREICH
FACHBEREICH MATHEMATIK
TECHNISCHE UNIVERSITÄT BERLIN
STRASSE DES 17. JUNI 136
10623 BERLIN
GERMANY
E-MAIL: dereich@math.tu-berlin.de


The Annals of Probability
2006, Vol. 34, No. 2, 779-803
DOI 10.1214/009117905000000701
© Institute of Mathematical Statistics, 2006


A MICROSCOPIC MODEL FOR STEFAN'S MELTING AND 
FREEZING PROBLEM 


By CLAUDIO LANDIM AND GLAUCO VALLE 
IMPA and CNRS, and Ecole Polytechnique Fédérale de Lausanne 


We study a class of one-dimensional interacting particle systems with 
random boundaries as a microscopic model for Stefan's melting and freezing 
problem. We prove that under diffusive rescaling these particle systems ex- 
hibit a hydrodynamic behavior described by the solution of a Cauchy—Stefan 
problem. 


1. Introduction. In this work we return to the classical Stefan’s freezing on 
the ground model [8]. It could be described in the following way: Consider the 
real line occupied by a heat-conducting material (heat is transmitted only by
conduction). This material is initially almost everywhere characterized by a bounded
and measurable temperature function $T : \mathbb{R} \to \mathbb{R}$. According to the temperature the
material could be in one of two phases, a liquid phase for positive temperatures and 
a solid phase for negative temperatures. The temperature T = 0 is that of crystal- 
lization at which both phases may occur. The problem consists in determining the 
temporal evolution of the temperature profile. 

We consider this problem under more restrictive conditions. Suppose that at
initial time the liquid phase fills the domain $u > 0$ at positive temperatures
and the solid phase fills the domain $u < 0$ at negative temperatures. Denote by
$\rho_0^- : \mathbb{R}_- \to \mathbb{R}_-$ and $\rho_0^+ : \mathbb{R}_+ \to \mathbb{R}_+$ the initial temperature profile. We are able to
determine a function $B = B(t)$ describing the time evolution of the boundary
between the two phases and their temperature functions, respectively $\rho_{-1}(t, u)$ and
$\rho_1(t, u)$ for the solid and liquid phases. It is well known that these functions satisfy
a Cauchy-Stefan problem:

(1.1)
\[
\begin{aligned}
& \partial_t \rho_{-1} = a_{-1} \partial_{uu} \rho_{-1}, \qquad \partial_t \rho_1 = a_1 \partial_{uu} \rho_1, \\
& \dot{B}(t) = k \{ a_{-1} \partial_u \rho_{-1}(t, B(t)) - a_1 \partial_u \rho_1(t, B(t)) \}, \qquad B(0) = 0, \\
& \rho_{\pm 1}(t, B(t)) = 0, \qquad \rho_{\pm 1}(0, \cdot) = \rho_0^{\pm},
\end{aligned}
\]

where $\rho_0^- : \mathbb{R}_- \to \mathbb{R}_-$ and $\rho_0^+ : \mathbb{R}_+ \to \mathbb{R}_+$ are bounded measurable functions,
$a_{-1} > 0$ and $a_1 > 0$ are the coefficients of heat conduction of the material with
respect to the solid and liquid phases and $k > 0$ is a scaling factor for the temperature.


Received October 2004; revised April 2005. 
AMS 2000 subject classification. 60K35. 
Key words and phrases. Exclusion processes, Cauchy-Stefan problem, hydrodynamic limit. 




In this paper we present a microscopic model for Stefan's equation through an
appropriate interacting particle system and scaling limit techniques. Such
descriptions have been proposed previously by Chayes and Swindle [2] in the case
of finite domains with $a_{-1} = 0$. Rezakhanlou [6] and Bertini, Buttà and Rüdiger [1]
also derive Stefan's equation as the hydrodynamic limit of interacting particle systems.

We shall denote by $\mathbb{Z}$, $\mathbb{N}$ and $\mathbb{Z}_-$ the sets of integers, positive integers and
nonpositive integers, respectively, and by $\mathbb{R}_-$ and $\mathbb{R}_+$ the sets of nonpositive and
nonnegative real numbers.

For the informal description of the microscopic model, consider the one-dimensional
lattice $\mathbb{Z}$ with each site being occupied by a molecular agglomerate of
type $-1$ for the material in the solid state and of type 1 for the material in the
liquid state. According to its internal energy, each agglomerate is classified by a heat
unit of 0 or 1. An interaction between neighboring sites occurs independently in
the following way: If the particles are of the same type, then their heat units are
interchanged after a mean $a_{-1}$ exponential time for particles of type $-1$ and after
a mean $a_1$ exponential time for particles of type 1. If the particles are of distinct
type and their heat units are also distinct, at rate 1 the heat unit of the agglomerate
whose heat value was 1 drops to 0 and simultaneously the other agglomerate
changes type. If the particles are of distinct type and their heat units are equal to 1,
both heat units drop to 0 after a mean 1 exponential time. Moreover, we start with
configurations such that the agglomerates are of type $-1$ if they are at the left of
the origin; otherwise they are of type 1.

We will show that this system has a hydrodynamic behavior under diffusive scaling
described by the solution of a Cauchy-Stefan problem of type (1.1) with scaling
factor $k = 1$, where the temperature is the macroscopic heat density profile. The
general case with an arbitrary $k$ can be obtained from the previous one by rescaling the
temperature by $k^{-1}$. Here the diffusive scaling is expected since the hydrodynamic
behavior of the simple symmetric exclusion process is described in this scale.
Actually, our model could be described as a coupling between two one-dimensional
nearest-neighbor simple symmetric exclusion processes in the semi-infinite lattice.
To make this identification, consider each agglomerate of solid (resp. liquid) phase
as a site in the space $\mathbb{Z}_-$ (resp. $\mathbb{N}$) in such a way that this association preserves
the order. At each site whose associated agglomerate has heat unit equal
to 1 we put a particle. In each of the lattices $\mathbb{Z}_-$, $\mathbb{N}$ particles evolve as in a
nearest-neighbor one-dimensional simple symmetric exclusion process with jump rates
$a_{-1}$ and $a_1$, respectively. Superposed to this dynamics, a particle at the boundary
of one of the lattices waits a mean 1 exponential time and attempts to leave the system.
If no particle occupies the boundary of the other lattice, this particle vanishes,
triggering a translation of the whole system to the right or left depending on whether
the particle was occupying a site in $\mathbb{Z}_-$ or in $\mathbb{N}$. If the boundary of the other lattice
is occupied, then both particles leave the system.

From the technical point of view, the main difficulty of this problem lies in the 
fact that no entropy argument can be used due to the annihilation mechanism. In- 
deed, if one fixes the boundary to be at the origin, the unique invariant measure is 




degenerate, and even when estimating the relative entropy with respect to a non- 
stationary state as in [3, 5], the translations introduce expressions too large to be 
estimated by the sole entropy. Coupling is therefore the unique available tool in 
this context. 


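2. Main result. Let $\Gamma = \{(-1, 0), (-1, 1), (1, 0), (1, 1)\}^{\mathbb{Z}}$ be the configuration
space and denote a typical configuration by $(\sigma, \eta) = \{(\sigma(x), \eta(x))\}_{x \in \mathbb{Z}} \in \Gamma$.
Fix $a_{-1} > 0$, $a_1 > 0$ and define the generators $\mathcal{G}_0$, $\mathcal{G}_1$ by

\[
\begin{aligned}
(\mathcal{G}_0 f)(\sigma, \eta) &= \sum_{x \in \mathbb{Z}} 1\{\sigma(x) = \sigma(x + 1)\} \, a_{\sigma(x)} \{ f(\sigma, \eta^{x,x+1}) - f(\sigma, \eta) \}, \\
(\mathcal{G}_1 f)(\sigma, \eta) &= \sum_{x \in \mathbb{Z}} 1\{\sigma(x) \ne \sigma(x + 1)\} \{ f(T^{x,x+1}(\sigma, \eta)) - f(\sigma, \eta) \}.
\end{aligned}
\]

In these formulas, $f$ stands for a cylinder function $f : \Gamma \to \mathbb{R}$, $\eta^{x,x+1}$ for the
configuration $\eta$ with spins at $x$, $x + 1$ interchanged,

(2.1)
\[
\eta^{x,x+1}(z) = \begin{cases} \eta(x + 1), & \text{if } z = x, \\ \eta(x), & \text{if } z = x + 1, \\ \eta(z), & \text{otherwise,} \end{cases}
\]

and $T^{x,x+1}(\sigma, \eta) = (\tilde{\sigma}, \tilde{\eta})$ for the configuration defined as follows:

\[
\tilde{\sigma}(z) = \begin{cases} \sigma(x), & \text{if } z = x + 1, \ \eta(x + 1) = 0, \ \eta(x) = 1, \\ \sigma(x + 1), & \text{if } z = x, \ \eta(x) = 0, \ \eta(x + 1) = 1, \\ \sigma(z), & \text{otherwise,} \end{cases}
\qquad
\tilde{\eta}(z) = \begin{cases} 0, & \text{if } z = x, x + 1, \\ \eta(z), & \text{otherwise.} \end{cases}
\]

Hence, two neighboring particles of different type annihilate each other at rate 1:
$(-1, 1), (1, 1) \to (-1, 0), (1, 0)$; and a particle sitting next to a vacant site of
different type dies, transforming the neighboring site into a site of its type at rate 1:
$(-1, 0), (1, 1) \to (1, 0), (1, 0)$.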
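
The two local moves entering the generators can be written as plain functions on a finite window of the lattice. The sketch below is only an illustration of these update rules: storing the configuration as Python lists `sigma` and `eta` indexed by site, and the function names `exchange` and `interface_move`, are assumptions of this example, and no random clocks or time scaling are simulated.

```python
def exchange(eta, x):
    """Return eta with the heat units at sites x and x + 1 interchanged."""
    new_eta = list(eta)
    new_eta[x], new_eta[x + 1] = eta[x + 1], eta[x]
    return new_eta


def interface_move(sigma, eta, x):
    """Apply the boundary move at the bond (x, x + 1) between sites of different type.

    If exactly one of the two sites carries a heat unit, the empty site takes
    the type of the occupied one; in every case both heat units are set to 0.
    """
    new_sigma, new_eta = list(sigma), list(eta)
    if eta[x] == 1 and eta[x + 1] == 0:
        new_sigma[x + 1] = sigma[x]
    elif eta[x] == 0 and eta[x + 1] == 1:
        new_sigma[x] = sigma[x + 1]
    new_eta[x] = 0
    new_eta[x + 1] = 0
    return new_sigma, new_eta


# Annihilation: (-1, 1), (1, 1) -> (-1, 0), (1, 0).
annihilated = interface_move([-1, 1], [1, 1], 0)
# Melting: (-1, 0), (1, 1) -> (1, 0), (1, 0).
melted = interface_move([-1, 1], [0, 1], 0)
# Freezing: (-1, 1), (1, 0) -> (-1, 0), (-1, 0).
frozen = interface_move([-1, 1], [1, 0], 0)
```

A full simulation would attach independent exponential clocks to the bonds, with rates depending on the types of the two sites, which is left out here.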

Denote by $(\sigma_t, \eta_t)$ the Markov process associated to the generator $\mathcal{G} = \mathcal{G}_0 + \mathcal{G}_1$
speeded up by $N^2$. The goal of this article is to show that its macroscopic behavior
is described by solutions of Stefan's equations.


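The Cauchy-Stefan problem. Let $a(u) = a_{-1}\mathbf 1\{u<0\} + a_1\mathbf 1\{u>0\}$, $A(u) = u\,a(u)$.
Fix a bounded measurable function $\rho_0:\mathbb R\to\mathbb R$ and consider the Stefan
problem

$$\begin{cases}
\partial_t\rho = a(\rho)\Delta\rho,\\
\dot u_i(t) = b_i\,\{\partial_u A(\rho(t,u_i(t)-)) - \partial_u A(\rho(t,u_i(t)+))\},\\
\rho(0,\cdot) = \rho_0(\cdot).
\end{cases} \tag{2.2}$$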


782 C. LANDIM AND G. VALLE 


Here $\Delta$ stands for the Laplacian, $u_i(t)$ for the curves at which $\rho(t,u_i(t)) = 0$,

$$b_i = \mathbf 1\{\rho(t,u_i(t)-) < 0 < \rho(t,u_i(t)+)\} - \mathbf 1\{\rho(t,u_i(t)+) < 0 < \rho(t,u_i(t)-)\}$$

and the first equation should be understood throughout $[0,T]\times\mathbb R$, except on the
curves $u_i(t)$.

Denote by $C_K^{1,2}([0,T)\times\mathbb R)$ the set of functions $G:[0,T)\times\mathbb R\to\mathbb R$ with
compact support which are continuously differentiable with respect to the first
variable and twice continuously differentiable with respect to the second variable.
A bounded measurable function $\rho:[0,T]\times\mathbb R\to\mathbb R$ is said to be a weak solution of
the Stefan problem (2.2) if for every function $G\in C_K^{1,2}([0,T)\times\mathbb R)$,

$$\int_{-\infty}^{+\infty} du \int_0^T dt\, \{A(\rho(t,u))\,\Delta G(t,u) + \omega(\rho(t,u))\,\partial_t G(t,u)\}
+ \int_{-\infty}^{+\infty} du\, \omega(\rho_0(u))\,G(0,u) = 0, \tag{2.3}$$

where

$$\omega(\rho) = \begin{cases} \rho - 1, & \rho < 0,\\ \rho, & \rho > 0,\\ -1, & \rho = 0. \end{cases} \tag{2.4}$$


The proof of uniqueness of weak solutions for the Stefan equation (2.2) pre- 
sented in [7], Theorem 20, page 312, for boundary-valued problems can be easily 
adapted to our context. Furthermore, the generalized solution is continuous ac- 
cording to Theorem 21 of [7]. 


The initial states. Let $\mathcal A$ be the subset of $\Gamma$ of all configurations $(\sigma,\eta)$ for
which there exists $x$ in $\mathbb Z$ such that

$$\sigma(z) = \begin{cases} -1, & \text{if } z \le x,\\ 1, & \text{if } z > x. \end{cases}$$

Note that $\mathcal A$ is stable under the dynamics induced by the generator $\mathcal G$.
For every $(\sigma,\eta)\in\mathcal A$, let

$$b = b(\sigma) := \sup\{z : \sigma(z) = -1\}$$

be the boundary of the configuration $(\sigma,\eta)$.

In the proof of the hydrodynamic behavior of the process $(\sigma_t,\eta_t)$ we impose
some conditions on the initial states. Let $(m^N, N\ge 1)$ be a sequence of probability
measures on $\Gamma$. We assume that:


(H1) For every $N\ge 1$, $m^N$ is concentrated on configurations of the set $\mathcal A$ such
that $b(\sigma) = 0$.
(H2) There exists a bounded measurable function $\rho_0:\mathbb R\to[-1,1]$ such that

$$\int_a^{\infty} \rho_0(u)\,du > 0, \qquad \int_{-\infty}^{-a} \rho_0(u)\,du < 0$$

for all $a>0$ and such that for each continuous function $G:\mathbb R\to\mathbb R$ with
compact support and each $\delta>0$,

$$\lim_{N\to\infty} m^N\Big[\, \Big| N^{-1}\sum_{x\in\mathbb Z} G(x/N)\,\sigma(x)\eta(x) - \int du\, G(u)\rho_0(u) \Big| > \delta \Big] = 0.$$

Notice that condition (H1) forces the initial profile to be negative on $\mathbb R_-$ and
positive on $\mathbb R_+$.
In the case $a_{-1} = 0$, we impose one more condition:

(H3) For every $N\ge 1$,

$$m^N\{(\sigma,\eta)\in\mathcal A : \eta(x) = 0,\ x\le 0\} = 1.$$


The hydrodynamic behavior. For each probability measure $m$ on $\Gamma$ concentrated
on $\mathcal A$, denote by $\mathbb P_m^N$ the probability measure on the path space $D(\mathbb R_+,\Gamma)$
induced by the Markov process $(\sigma_t,\eta_t)$ with generator $\mathcal G$, speeded up by $N^2$ and
initial measure $m$.

Denote by $\mathcal M = \mathcal M(\mathbb R)$ the space of signed Radon measures on $\mathbb R$ endowed with
the vague topology. Integration of a function $G$ with respect to a measure $\pi$ in $\mathcal M$
is denoted by $\langle\pi,G\rangle$. To each configuration $(\sigma,\eta)\in\Gamma$ we associate the empirical
measure $\pi^N = \pi^N(\sigma,\eta)$ in $\mathcal M$ by assigning mass $\sigma(x)N^{-1}$ to each particle:

$$\pi^N = \frac1N \sum_{x\in\mathbb Z} \sigma(x)\eta(x)\,\delta_{x/N}.$$

Let $\pi_t^N = \pi^N(\sigma_t,\eta_t)$, $b_t^N = b(\sigma_t)/N$.
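As a concrete reading of these definitions, the sketch below evaluates $\langle\pi^N,G\rangle$ and the boundary $b(\sigma)$ for a configuration stored on finitely many sites. The dictionary representation and the names `pair_empirical` and `boundary` are assumptions made for illustration only.

```python
def pair_empirical(sigma, eta, G, N):
    """<pi^N, G> = N^{-1} * sum_x sigma(x) * eta(x) * G(x / N) over the stored sites."""
    return sum(sigma[x] * eta[x] * G(x / N) for x in eta) / N

def boundary(sigma):
    """b(sigma) = sup{z : sigma(z) = -1}, computed on the stored window."""
    return max(z for z, s in sigma.items() if s == -1)

# A small configuration in the set A with boundary at the origin.
sigma = {-1: -1, 0: -1, 1: 1, 2: 1}
eta = {-1: 1, 0: 0, 1: 1, 2: 1}
total_charge = pair_empirical(sigma, eta, lambda u: 1.0, 2)
first_moment = pair_empirical(sigma, eta, lambda u: u, 2)
```

Particles of type $-1$ enter with negative sign, so the pairing with a constant test function returns the net charge divided by $N$.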


THEOREM 2.1. Fix a sequence of initial measures $\{m^N : N\ge 1\}$ satisfying
assumptions (H1), (H2) and (H3) if $a_{-1} = 0$. For each $t\ge 0$, as $N\uparrow\infty$, the
empirical measure $\pi_t^N$ converges in probability to an absolutely continuous measure
$\pi(t,du) = \rho(t,u)\,du$, whose density $\rho(t,u)$ is the weak solution of the Stefan
problem (2.2): For every continuous function $G:\mathbb R\to\mathbb R$ with compact support
and every $\delta>0$

$$\lim_{N\to\infty} \mathbb P^N_{m^N}\Big[\, \Big| \langle\pi_t^N,G\rangle - \int du\, G(u)\rho(t,u) \Big| > \delta \Big] = 0.$$

Moreover, for every $\delta>0$

$$\lim_{N\to\infty} \mathbb P^N_{m^N}\big[\, |b_t^N - B(t)| > \delta \big] = 0,$$

where $B$ is the solution of $B(0) = 0$,

$$\dot B(t) = a_{-1}(\partial_u\rho)(t,B(t)-) - a_1(\partial_u\rho)(t,B(t)+). \tag{2.5}$$




3. The fixed boundary model. Recall the definition of the set $\mathcal A$ given in the
previous section and of the boundary $b = b(\sigma)$ of a configuration in $\mathcal A$. To each
configuration $(\sigma,\eta)$ in $\mathcal A$ let $\xi = \xi_{\sigma,\eta}$ in $\Omega = \{0,1\}^{\mathbb Z}$ be the configuration viewed
from the boundary:

$$\xi(x) = \eta(x+b).$$

It is not difficult to check that $\xi_t$ is a Markov process with generator $\mathcal L$ given by

$$\mathcal L = a_1 L_1 + L_b + a_{-1} L_{-1}.$$

Here $L_{-1}$ and $L_1$ are the parts of the generator related to the motion of particles in
a simple symmetric exclusion process on $\mathbb Z_-$ and $\mathbb N$, respectively:

$$L_{-1} = \sum_{x<0} L_{x,x+1}, \qquad L_1 = \sum_{x\ge 1} L_{x,x+1},$$

where, for every local function $f:\Omega\to\mathbb R$ and every integer $x$,

$$(L_{x,x+1} f)(\xi) = f(\xi^{x,x+1}) - f(\xi),$$

and $\xi^{x,x+1}$ is the configuration $\xi$ with spins at $x$, $x+1$ interchanged defined in (2.1).
In contrast, $L_b$ is the part of the generator related to the dissipative feature of
the system: For every local function $f:\Omega\to\mathbb R$

$$\begin{aligned}
(L_b f)(\xi) ={}& \xi(1)[1-\xi(0)]\,\{f(\tau_{-1}\xi - \varrho_2) - f(\xi)\}\\
&+ \xi(0)[1-\xi(1)]\,\{f(\tau_{1}\xi - \varrho_{-1}) - f(\xi)\}\\
&+ \xi(0)\xi(1)\,\{f(\xi - \varrho_0 - \varrho_1) - f(\xi)\},
\end{aligned}$$

where $\varrho_x$ stands for the configuration with no particles but one at $x$, and $\{\tau_x : x\in\mathbb Z\}$
for the group of translations so that $(\tau_x\xi)(z) = \xi(z+x)$ for all $z$ in $\mathbb Z$.
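The boundary part of the dynamics can be sketched on configurations with finitely many particles, stored as a set of occupied sites. The reading used here is an assumption consistent with Section 2 and with the coupling of Section 5: when a lone particle at site 1 dies the boundary moves one step to the left, so every surviving particle is seen one site further to the right, and symmetrically for a lone particle at site 0. The names `translate` and `boundary_moves` are hypothetical.

```python
def translate(occupied, x):
    """Occupied sites of tau_x xi, where (tau_x xi)(z) = xi(z + x)."""
    return {y - x for y in occupied}

def boundary_moves(occupied):
    """List the (rate, new configuration) pairs produced by the boundary part L_b."""
    xi0, xi1 = 0 in occupied, 1 in occupied
    moves = []
    if xi1 and not xi0:
        # Lone particle at site 1 dies; the survivors shift one site to the right.
        moves.append((1.0, translate(occupied, -1) - {2}))
    if xi0 and not xi1:
        # Lone particle at site 0 dies; the survivors shift one site to the left.
        moves.append((1.0, translate(occupied, 1) - {-1}))
    if xi0 and xi1:
        # Both boundary particles annihilate; no translation.
        moves.append((1.0, occupied - {0, 1}))
    return moves
```

In each case at most one move is available, and a configuration with sites 0 and 1 both empty has no boundary move at all.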

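Fix a bounded measurable function $\lambda_0:\mathbb R\to\mathbb R$ such that $\lambda_0(u) > 0$, $\lambda_0(u) < 0$
for $u>0$, $u<0$, respectively. A pair $(\lambda,D)$, where $\lambda$ is a bounded measurable
function $\lambda:[0,T]\times\mathbb R\to\mathbb R$ strictly positive a.e. on $(0,\infty)$, strictly negative a.e.
on $(-\infty,0)$ and $D:[0,T]\to\mathbb R$ is a continuous function of bounded variation
vanishing at $t=0$, is said to be a weak solution of

$$\begin{cases}
\partial_t\lambda = a\Delta\lambda + \dot D(t)\,\partial_u\lambda,\\
\dot D(t) = a_{-1}(\partial_u\lambda)(t,0-) - a_1(\partial_u\lambda)(t,0+),\\
\lambda(0,\cdot) = \lambda_0(\cdot),
\end{cases} \tag{3.1}$$

in the layer $[0,T]\times\mathbb R$, if for every function $G\in C_K^{1,2}([0,T)\times\mathbb R)$,

$$\begin{aligned}
&\int_{-\infty}^{+\infty} du \int_0^T dt\, \{A(\lambda(t,u))\,\Delta G(t,u) + \lambda(t,u)\,\partial_t G(t,u)\}\\
&\quad - \int_0^T \Big\{ \int_{-\infty}^{+\infty} du\, \lambda(t,u)\,\partial_u G(t,u) - G(t,0) \Big\}\, dD(t)
+ \int_{-\infty}^{+\infty} du\, \lambda_0(u)\,G(0,u) = 0.
\end{aligned} \tag{3.2}$$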




Notice that we are requiring the solution to be strictly positive, negative on $(0,\infty)$,
$(-\infty,0)$, respectively.

Lemma 3.1 below shows that weak solutions of (2.2) may be obtained by a
simple change of variables from weak solutions of (3.1). In particular, uniqueness
of weak solutions of (3.1) follows from the uniqueness for (2.2). Indeed, assume
that $(\lambda_t,D_t)$ is a weak solution of (3.1). By Lemma 3.1 and by the uniqueness of
weak solutions of (2.2), $\rho(t,u) = \lambda(t,u-D_t)$ is the unique weak solution of (2.2).
Since the weak solution of (2.2) is continuous and since $\lambda$ is a.e. strictly positive,
negative on $(0,\infty)$, $(-\infty,0)$, respectively, for each $t\ge 0$, $\rho(t,\cdot)$ vanishes at a
unique point. This determines uniquely $D$, and therefore $\lambda$.


LEMMA 3.1. Let $(\lambda_t,D_t)$ be a weak solution of (3.1) and let $\rho(t,u) =
\lambda(t,u-D_t)$. Then, $\rho$ is a weak solution of (2.2).


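PROOF. Consider a weak solution $(\lambda_t,D_t)$ of (3.1) and write $D_t$ as the difference
of two continuous increasing bounded functions: $D_t = D_t^+ - D_t^-$. Let $D_t^{\pm,\varepsilon}$
be smooth uniform approximations of $D_t^{\pm}$ and set $D_t^{\varepsilon} = D_t^{+,\varepsilon} - D_t^{-,\varepsilon}$ so that

$$\lim_{\varepsilon\to 0}\ \sup_{0\le t\le T} |D_t^{\varepsilon} - D_t| = 0. \tag{3.3}$$

Fix a smooth function $G:[0,T]\times\mathbb R\to\mathbb R$ with compact support and vanishing
at the boundary $t=T$. Let $H^{\varepsilon}(t,u) = G(t,u+D_t^{\varepsilon})$. $H^{\varepsilon}$ is a smooth function with
compact support. Therefore, since $(\lambda_t,D_t)$ is a weak solution of (3.1) and since $G$
vanishes at the boundary $t=T$,

$$\begin{aligned}
0 &= \langle\lambda_T, G(T,\cdot+D_T^{\varepsilon})\rangle = \langle\lambda_T,H_T^{\varepsilon}\rangle\\
&= \langle\lambda_0,H_0^{\varepsilon}\rangle + \int_0^T ds\, \langle\lambda_s,(\partial_s + a\Delta)H_s^{\varepsilon}\rangle
- \int_0^T \{\langle\lambda_s,\partial_u H_s^{\varepsilon}\rangle - H_s^{\varepsilon}(0)\}\, dD_s.
\end{aligned} \tag{3.4}$$

Recall the definition of the function $\omega$ given in (2.4) and that weak solutions
of (3.1) are strictly positive in $(0,\infty)$ and strictly negative in $(-\infty,0)$. The first
two terms on the right-hand side of (3.4) can be rewritten as

$$\begin{aligned}
&\langle\omega(\lambda_0),H_0^{\varepsilon}\rangle + \int_0^T ds\, \langle\lambda_s, a\Delta H_s^{\varepsilon}\rangle + \int_0^T ds\, \langle\omega(\lambda_s),\partial_s H_s^{\varepsilon}\rangle\\
&\quad + \langle\mathbf 1\{(-\infty,0)\},H_0^{\varepsilon}\rangle + \int_0^T ds\, \langle\mathbf 1\{(-\infty,0)\},\partial_s H_s^{\varepsilon}\rangle.
\end{aligned}$$

Since $G$, and therefore $H^{\varepsilon}$, vanish at the boundary $t=T$, the second line of the
previous expression is equal to 0. On the other hand, the first two terms of the first
line are easily seen to converge to

$$\langle\omega(\rho_0),G_0\rangle + \int_0^T ds\, \langle A(\rho_s),\Delta G_s\rangle \tag{3.5}$$

as $\varepsilon\downarrow 0$. The last term of the first line together with the last term on the second
line of (3.4) is equal to

$$\begin{aligned}
&\int_0^T \langle\omega(\lambda_s),(\partial_s G)(s,\cdot+D_s^{\varepsilon})\rangle\, ds
+ \int_0^T \langle\omega(\lambda_s),(\partial_u G)(s,\cdot+D_s^{\varepsilon})\rangle\, dD_s^{\varepsilon}\\
&\quad - \int_0^T \{\langle\lambda_s,(\partial_u H^{\varepsilon})(s,\cdot)\rangle - H_s^{\varepsilon}(0)\}\, dD_s.
\end{aligned}$$

The first term converges, as $\varepsilon\downarrow 0$, to

$$\int_0^T ds\, \langle\omega(\rho_s),\partial_s G_s\rangle, \tag{3.6}$$

while, by definition of $\omega$ and $H^{\varepsilon}$, the sum of the second and third terms is equal to

$$-\int_0^T G(s,D_s^{\varepsilon})\, d(D_s^{\varepsilon} - D_s) + \int_0^T \langle\lambda_s,\partial_u H_s^{\varepsilon}\rangle\, d(D_s^{\varepsilon} - D_s).$$

It is not difficult to show from (3.3) that this expression vanishes as $\varepsilon\downarrow 0$.
It follows from (3.4), (3.5) and (3.6) that

$$\langle\omega(\rho_0),G_0\rangle + \int_0^T ds\, \langle A(\rho_s),\Delta G_s\rangle + \int_0^T ds\, \langle\omega(\rho_s),\partial_s G_s\rangle = 0,$$

which concludes the proof of the lemma. $\square$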


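Hypotheses on the initial measures. Fix a sequence of probability measures
$(\mu^N : N\ge 1)$ on $\Omega$. To prove the hydrodynamic behavior of the system we will
assume that

(H1) There exists a bounded measurable initial profile $\lambda_0:\mathbb R\to[-1,1]$ such that

$$\int_a^{\infty} \lambda_0(u)\,du > 0, \qquad \int_{-\infty}^{-a} \lambda_0(u)\,du < 0$$

for all $a>0$ and such that for each $\delta>0$ and each continuous function
$G:\mathbb R\to\mathbb R$ with compact support

$$\lim_{N\to\infty} \mu^N\Big[\, \Big| \frac1N\sum_{x\in\mathbb Z} G(x/N)\,\mathbf 1_{\pm}(x)\,\xi(x) - \int_{-\infty}^{\infty} du\, G(u)\lambda_0(u) \Big| > \delta \Big] = 0,$$

where $\mathbf 1_{\pm}(u) = -\mathbf 1\{u\le 0\} + \mathbf 1\{u>0\}$.
In the case $a_{-1} = 0$, we impose one more condition:
(H2) For every $N\ge 1$,

$$\mu^N\{\xi\in\Omega : \xi(x) = 0,\ x\le 0\} = 1.$$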




The hydrodynamic behavior. For each probability measure $\mu$ on $\Omega$, denote
by $\mathbb P_\mu^N$ the probability measure on the path space $D(\mathbb R_+,\Omega)$ induced by the Markov
process $\xi_t$ with generator $\mathcal L$ speeded up by $N^2$ starting from the initial measure $\mu$.
Let $D_+^N(t)$ [resp. $D_-^N(t)$] be the total number of particles on $\mathbb N$ (resp. $\mathbb Z_-$) which
left the system before time $t$ divided by $N$ and let $D^N(t) = D_-^N(t) - D_+^N(t)$.
Formally $D_+^N(t) = N^{-1}\sum_{x\ge 1}\{\xi_0(x) - \xi_t(x)\}$.

THEOREM 3.2. Fix a sequence of initial measures $\{\mu^N, N\ge 1\}$ satisfying
(H1) and (H2) if $a_{-1} = 0$. Then, for any $t\ge 0$, any continuous $G:\mathbb R\to\mathbb R$ with
compact support and any $\delta>0$

$$\lim_{N\to\infty} \mathbb P^N_{\mu^N}\Big[\, \Big| \frac1N\sum_{x\in\mathbb Z} \mathbf 1_{\pm}(x/N)\,G(x/N)\,\xi_t(x) - \int du\, G(u)\lambda(t,u) \Big| > \delta \Big] = 0,$$

$$\lim_{N\to\infty} \mathbb P^N_{\mu^N}\big[\, |D^N(t) - D(t)| > \delta \big] = 0,$$

where $(\lambda,D)$ is the unique weak solution of (3.1).


4. Proof of Theorems 2.1 and 3.2. We first show how Theorem 2.1 can be
recovered from Theorem 3.2.


PROOF OF THEOREM 2.1. Notice first that the evolution of the original
process $(\sigma_t,\eta_t)$ can be derived from the one of $\xi_t$ since $b_t^N = D^N(t)$ and
$\eta_t = \tau_{-b(\sigma_t)}\xi_t$.

By Theorem 3.2, for every $t\ge 0$, $b_t^N$ converges in probability to $D(t)$. This
proves the second statement of the theorem since $D(\cdot)$ satisfies (2.5) in virtue
of (3.1).

On the other hand, if $G:\mathbb R\to\mathbb R$ is a continuous function with compact support,
by the previous relations between $\xi$ and $(\sigma,\eta)$,

$$\langle\pi_t^N,G\rangle = \frac1N\sum_{x\in\mathbb Z} \mathbf 1_{\pm}(x/N)\,G(D^N(t) + x/N)\,\xi_t(x).$$

By Theorem 3.2, this expression converges in probability to $\int du\, G(D(t)+u)\lambda(t,u)$
and this integral is equal to $\int du\, G(u)\rho(t,u)$ by Lemma 3.1. This
concludes the proof of Theorem 2.1. $\square$


We turn now to the proof of Theorem 3.2. We present the proof in the case
$a_{-1} > 0$ which is more difficult. The same arguments apply to the case $a_{-1} = 0$.

Recall that we denote by $\mathcal M = \mathcal M(\mathbb R)$ the space of signed Radon measures on $\mathbb R$
endowed with the vague topology and that we denote integration of a function $G$
with respect to a measure $\pi$ in $\mathcal M$ by $\langle\pi,G\rangle$. For each $N\ge 1$ and each
configuration $\xi$ of $\Omega$, let $\pi^N$ be the empirical measure associated to $\xi$ given by

$$\pi^N = \frac1N\sum_{x\in\mathbb Z} \mathbf 1_{\pm}(x)\,\xi(x)\,\delta_{x/N}.$$

Note the indicator function $\mathbf 1_{\pm}(x)$ in the definition which corresponds to
considering particles on $\mathbb Z_-$ as having negative charge. Let $\pi_t^N = \pi^N(\xi_t)$ and recall that
we are speeding up the process by $N^2$.

Recall that $D_+^N(t)$ [resp. $D_-^N(t)$] stands for the total number of particles on $\mathbb N$
(resp. $\mathbb Z_-$) which left the system before time $t$ divided by $N$ and that $D^N(t) =
D_-^N(t) - D_+^N(t)$.

With this notation, Theorem 3.2 states that the sequence $Q^N_{\mu^N}$ converges weakly
to the probability measure concentrated on paths $(\pi_t,D(t))$ whose first coordinate
is absolutely continuous, $\pi(t,du) = \lambda(t,u)\,du$, the density being the solution
of (3.1) (cf. [4]). The proof consists in showing tightness, that all limit points are
concentrated on absolutely continuous paths which are weak solutions of (3.1) and
uniqueness of weak solutions of this equation.

Uniqueness of weak solutions of (3.1) was discussed in the previous section,
while tightness is proved at the end of this section. We show now that all limit
points are concentrated on weak solutions.

Fix a sequence $\mu^N$ of probability measures on $\Omega$ satisfying the assumptions of
the theorem. Note that all limit points of the sequence $Q^N = Q^N_{\mu^N}$ are concentrated
on absolutely continuous measures since in the limit the $\pi$-measure of a finite
interval is bounded by its Lebesgue measure.


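PROPOSITION 4.1. All limit points $Q^*$ of the sequence $Q^N_{\mu^N}$ are concentrated
on trajectories $(\pi_t,D_t)$ such that

$$\langle\pi_t,G_t\rangle - \langle\pi_0,G_0\rangle = \int_0^t ds\, \langle\pi_s,(\partial_s + a\Delta)G_s\rangle
- \int_0^t \{\langle\pi_s,\partial_u G_s\rangle - G(s,0)\}\, dD_s$$

for every $t\ge 0$ and $G$ in $C_K^{1,2}([0,T)\times\mathbb R)$.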


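PROOF. The proof of this proposition is divided into several steps. We start by
examining some martingales associated to the empirical measure. Fix $G\in
C_K^{1,2}([0,T)\times\mathbb R)$ and $\delta>0$. Consider the martingale $M_t^{G,N}$ given by

$$M_t^{G,N} = \langle\pi_t^N,G_t\rangle - \langle\pi_0^N,G_0\rangle - \int_0^t (\partial_s + N^2\mathcal L)\langle\pi_s^N,G_s\rangle\, ds.$$

An elementary computation shows that the quadratic variation $\langle M^{G,N}\rangle_t$ of this
martingale is equal to the time integral of

$$\begin{aligned}
&\frac1{N^2}\sum_{x\neq 0} a(x/N)\,[\xi(x+1)-\xi(x)]^2\, ((\nabla_N G)(x/N))^2\\
&\quad + \xi(0)[1-\xi(1)]\,\{G(-1/N) - \langle\pi^N,(\nabla_N G)(\cdot - 1/N)\rangle\}^2\\
&\quad + \xi(1)[1-\xi(0)]\,\{G(2/N) - \langle\pi^N,\nabla_N G\rangle\}^2\\
&\quad + \xi(1)\xi(0)\,\{G(1/N) - G(0)\}^2.
\end{aligned} \tag{4.1}$$

In particular, since $G$ is a smooth function with compact support, by Chebyshev
and Doob inequalities,

$$\begin{aligned}
\mathbb P^N_{\mu^N}\Big[ \sup_{0\le t\le T} |M_t^{G,N}| > \delta \Big]
&\le 4\delta^{-2}\, \mathbb E^N_{\mu^N}\big[\langle M^{G,N}\rangle_T\big]\\
&\le C(a,G)\,\delta^{-2}\, \Big\{ \frac TN + \mathbb E^N_{\mu^N}\Big[ \int_0^T \{\xi_s(0) + \xi_s(1)\}\, ds \Big] \Big\}
\end{aligned}$$

for some finite constant $C(a,G)$ depending only on $a(\cdot)$ and $G$. Therefore, by
Lemma 5.5,

$$\lim_{N\to\infty} \mathbb P^N_{\mu^N}\Big[ \sup_{0\le t\le T} |M_t^{G,N}| > \delta \Big] = 0. \tag{4.2}$$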
N ->00 0<t<T 


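On the other hand, an elementary computation shows that for every smooth
function $H:\mathbb R\to\mathbb R$ with compact support,

$$\begin{aligned}
N^2\mathcal L\langle\pi^N,H\rangle ={}& \langle\pi^N, a\Delta_N H\rangle\\
&+ N[\xi(1)-\xi(0)]\,\langle\pi^N,\nabla_N H\rangle - N\xi(1)H(2/N)\\
&+ N\xi(0)H(-1/N) + [a_1\xi(1) + a_{-1}\xi(0)]\,(\nabla_N H)(0)\\
&+ \xi(0)\,\langle\pi^N,\Delta_N H\rangle\\
&- \xi(0)\xi(1)\,\{\langle\pi^N,\Delta_N H\rangle - N^{-1}(\Delta_N H)(1/N)\},
\end{aligned} \tag{4.3}$$

where $\Delta_N$ and $\nabla_N$ denote respectively the discrete Laplacian and gradient.
Therefore, in view of Lemma 5.5, up to negligible terms, the martingale $M_t^{G,N}$ can be
written as

$$\langle\pi_t^N,G_t\rangle - \langle\pi_0^N,G_0\rangle - \int_0^t ds\, \langle\pi_s^N,\partial_s G_s + a\Delta G_s\rangle
- \int_0^t ds\, N[\xi_s(1)-\xi_s(0)]\,\{\langle\pi_s^N,\partial_u G_s\rangle - G(s,0)\}.$$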
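The discrete operators entering (4.3) can be checked numerically. The sketch below assumes the usual forward-difference gradient $(\nabla_N H)(u) = N[H(u+1/N) - H(u)]$ and centered Laplacian $(\Delta_N H)(u) = N^2[H(u+1/N) + H(u-1/N) - 2H(u)]$ (the text does not spell these out, and the names `grad_N`, `lap_N` are hypothetical). It verifies the summation-by-parts identity $(\nabla_N H)(u) - (\nabla_N H)(u-1/N) = N^{-1}(\Delta_N H)(u)$, which is what turns a shifted gradient into a gradient plus a Laplacian correction of order $N^{-1}$.

```python
import math

def grad_N(H, N, u):
    """Forward discrete gradient: N * (H(u + 1/N) - H(u))."""
    return N * (H(u + 1.0 / N) - H(u))

def lap_N(H, N, u):
    """Discrete Laplacian: N^2 * (H(u + 1/N) + H(u - 1/N) - 2 H(u))."""
    return N * N * (H(u + 1.0 / N) + H(u - 1.0 / N) - 2.0 * H(u))

def shifted_gradient_gap(H, N, u):
    """Difference between the gradient at u and at u - 1/N, minus lap_N / N."""
    return grad_N(H, N, u) - grad_N(H, N, u - 1.0 / N) - lap_N(H, N, u) / N

# For a smooth H the discrete operators approach H' and H'' as N grows.
gap = shifted_gradient_gap(math.sin, 1000, 0.3)
grad_error = abs(grad_N(math.sin, 1000, 0.3) - math.cos(0.3))
lap_error = abs(lap_N(math.sin, 1000, 0.3) + math.sin(0.3))
```

The quantity `gap` vanishes up to rounding, while `grad_error` and `lap_error` are small for large $N$.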


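By Lemma 4.3, provided we let $\varepsilon\downarrow 0$ after $N\uparrow\infty$, we may replace $N[\xi_s(0) -
\xi_s(1)]$ by $\varepsilon^{-1}[D^N(s+\varepsilon) - D^N(s)]$. Therefore, in view of (4.2),

$$\begin{aligned}
\lim_{\varepsilon\to 0}\limsup_{N\to\infty} \mathbb P^N_{\mu^N}\Big[ \sup_{0\le t\le T} \Big|\,
&\langle\pi_t^N,G_t\rangle - \langle\pi_0^N,G_0\rangle - \int_0^t ds\, \langle\pi_s^N,(\partial_s + a\Delta)G_s\rangle\\
&+ \int_0^t ds\, \frac{D^N(s+\varepsilon) - D^N(s)}{\varepsilon}\,
\{\langle\pi_s^N,\partial_u G_s\rangle - G(s,0)\} \Big| > \delta \Big] = 0.
\end{aligned}$$

Hence, for any limit point $Q^*$ of the sequence $Q^N$,

$$\begin{aligned}
\lim_{\varepsilon\to 0} Q^*\Big[ \sup_{0\le t\le T} \Big|\,
&\langle\pi_t,G_t\rangle - \langle\pi_0,G_0\rangle - \int_0^t ds\, \langle\pi_s,(\partial_s + a\Delta)G_s\rangle\\
&+ \int_0^t ds\, \frac{D(s+\varepsilon) - D(s)}{\varepsilon}\,
\{\langle\pi_s,\partial_u G_s\rangle - G(s,0)\} \Big| > \delta \Big] = 0.
\end{aligned}$$

By the proof of tightness of the second marginal of $Q^N$ presented at the end of this
section, $Q^*$ is concentrated on paths $D$ which are of bounded variation. In particular,
if $d^{\varepsilon}(s) = \varepsilon^{-1}\{D(s+\varepsilon) - D(s)\}$, for any continuous function $H:[0,T]\to\mathbb R$,
$\int_0^t ds\, d^{\varepsilon}(s)H(s)$ converges, as $\varepsilon\downarrow 0$, to $\int_0^t H(s)\, dD(s)$. Since by Lemma 4.6 $Q^*$ is
concentrated on paths $\pi_t$ which are continuous for the vague topology, the
proposition is proved. $\square$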


The second main result of this section states that limit points of the sequence $Q^N$
are concentrated on trajectories $\pi_t$ whose density is bounded below by a strictly
positive function.


PROPOSITION 4.2. For each $\delta>0$, there exists a strictly positive continuous
function $R_\delta:\mathbb R\to(0,1]$ with the following property. All limit points $Q^*$ of the
sequence $Q^N$ are concentrated on trajectories $(\pi_t,D_t)$ such that

$$\pi_t(I) \ge \int_I R_\delta(u)\, du$$

for all $0\le t\le T$ and all finite intervals $I = [c,d]$ such that $c > \delta$. A similar
statement holds in $(-\infty,0)$.


The proof of this proposition is postponed to Section 5. 


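LEMMA 4.3. Fix a smooth function $G$ in $C_K^{1,2}([0,T]\times\mathbb R)$ and $\delta>0$. Then,

$$\lim_{\varepsilon\to 0}\limsup_{N\to\infty} \mathbb P^N_{\mu^N}\Big[ \sup_{0\le t\le T} \Big| \int_0^t ds\,
\Big\{ N\xi_s(1) - \frac{D_+^N(s+\varepsilon) - D_+^N(s)}{\varepsilon} \Big\}\, \langle\pi_s^N,G_s\rangle \Big| > \delta \Big] = 0.$$

A similar statement holds if we replace $\xi_s(1)$, $D_+^N(s)$ by $\xi_s(0)$, $D_-^N(s)$, respectively,
or $\langle\pi_s^N,G_s\rangle$ by $G(s,0)$.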


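PROOF. A simple computation shows that

$$M_+^N(t) = D_+^N(t) - N\int_0^t \xi_s(1)\, ds$$

is a martingale with quadratic variation $\langle M_+^N\rangle_t$ given by $\int_0^t \xi_s(1)\, ds$. We may
therefore write $\varepsilon^{-1}(D_+^N(s+\varepsilon) - D_+^N(s))$ as

$$\frac{M_+^N(s+\varepsilon) - M_+^N(s)}{\varepsilon} + \frac N\varepsilon \int_s^{s+\varepsilon} \xi_r(1)\, dr.$$

The martingale part is easy to estimate because it vanishes in the limit $N\uparrow\infty$.
Indeed, by Chebyshev and Schwarz inequalities and by the explicit formula for the
quadratic variation of the martingale $M_+^N(t)$,

$$\begin{aligned}
&\mathbb P^N_{\mu^N}\Big[ \sup_{0\le t\le T} \Big| \int_0^t ds\, \frac{M_+^N(s+\varepsilon) - M_+^N(s)}{\varepsilon}\, \langle\pi_s^N,G_s\rangle \Big| > \delta \Big]\\
&\quad \le \frac{C(G)}{\varepsilon\delta}\, \mathbb E^N_{\mu^N}\Big[ \int_0^{T+\varepsilon} ds\, |M_+^N(s)| \Big]
\le \frac{C(G,T)}{\varepsilon\delta}\, \mathbb E^N_{\mu^N}\Big[ \int_0^{T+\varepsilon} ds\, \xi_s(1) \Big]^{1/2}.
\end{aligned}$$

Here and below $C(G)$, $C(G,T)$ are finite constants depending only on $G$ and
$G$, $T$, respectively. By Lemma 5.5, this expression vanishes as $N\uparrow\infty$.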

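It remains to consider the difference $N\xi_s(1) - N\varepsilon^{-1}\int_s^{s+\varepsilon} \xi_r(1)\, dr$, which is slightly
more demanding. We first perform a time integration by parts to obtain that

$$\begin{aligned}
&\Big| \int_0^t ds\, \Big\{ N\xi_s(1) - \frac N\varepsilon \int_s^{s+\varepsilon} \xi_r(1)\, dr \Big\}\, \langle\pi_s^N,G_s\rangle \Big|\\
&\quad \le C(G)\, \Big\{ N\int_0^{\varepsilon} ds\, \xi_s(1) + N\int_t^{t+\varepsilon} ds\, \xi_s(1) \Big\}
+ \Big| \int_{\varepsilon}^t ds\, N\xi_s(1)\, \frac1\varepsilon \int_{s-\varepsilon}^s dr\, \{\langle\pi_s^N,G_s\rangle - \langle\pi_r^N,G_r\rangle\} \Big|.
\end{aligned}$$

Lemma 5.6 allows us to estimate the first term on the right-hand side. To estimate
the second one, write the difference $\langle\pi_s^N,G_s\rangle - \langle\pi_r^N,G_r\rangle$ as

$$M_s^{G,N} - M_r^{G,N} + \int_r^s (\partial_v + N^2\mathcal L)\langle\pi_v^N,G_v\rangle\, dv.$$

On the one hand,

$$\begin{aligned}
&\mathbb P^N_{\mu^N}\Big[ \sup_{0\le t\le T} \Big| \int_{\varepsilon}^t ds\, N\xi_s(1)\, \frac1\varepsilon \int_{s-\varepsilon}^s dr\, \{M_s^{G,N} - M_r^{G,N}\} \Big| > \delta \Big]\\
&\quad \le \mathbb P^N_{\mu^N}\Big[ \sup_{0\le t\le T+\varepsilon} |M_t^{G,N}| \int_0^T N\xi_s(1)\, ds > \delta/2 \Big].
\end{aligned} \tag{4.4}$$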


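Fix $\gamma>0$. By Lemma 5.5, there exists a finite constant $A$ for which

$$\mathbb P^N_{\mu^N}\Big[ \int_0^T N\xi_s(1)\, ds \ge A \Big] \le \gamma$$

for all $N\ge 1$. Therefore, the previous probability is less than or equal to

$$\gamma + \mathbb P^N_{\mu^N}\Big[ \sup_{0\le t\le T+\varepsilon} |M_t^{G,N}| > \frac{\delta}{2A} \Big].$$

It follows from (4.2) that this expression is bounded by $\gamma$ as $N\uparrow\infty$. This proves
that (4.4) vanishes in the limit $N\uparrow\infty$.

It remains to show that

$$\mathbb P^N_{\mu^N}\Big[ \sup_{0\le t\le T} \Big| \int_{\varepsilon}^t ds\, N\xi_s(1)\, \frac1\varepsilon \int_{s-\varepsilon}^s dr \int_r^s (\partial_v + N^2\mathcal L)\langle\pi_v^N,G_v\rangle\, dv \Big| > \delta \Big] \tag{4.5}$$

vanishes as $N\uparrow\infty$, $\varepsilon\downarrow 0$.
By the explicit expression for $(\partial_s + N^2\mathcal L)\langle\pi_s^N,G_s\rangle$ given in (4.3), we have that
the absolute value of the integral in this formula is dominated by

$$C(a,G) \int_0^T N\xi_s(1)\, ds\, \Big\{ \varepsilon + \sup_{0\le t\le T} \int_t^{t+\varepsilon} N[\xi_s(0) + \xi_s(1)]\, ds \Big\}.$$

Repeating the argument presented just after (4.4) to eliminate $\int_0^T N\xi_s(1)\, ds$ and
applying Lemma 5.6 to estimate the second term, we show that (4.5) vanishes as
$N\uparrow\infty$, $\varepsilon\downarrow 0$. This concludes the proof of the lemma. $\square$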


We conclude this section by proving that the sequence of probability measures
$Q^N$ is tight, which in our context reduces to showing that the marginal of $Q^N$ on
each coordinate is tight. We start with the empirical measure. Denote by $Q_1^N$ the
marginal of $Q^N$ on the first coordinate.

Recall that $Q_1^N$ is tight if for each smooth function with compact support
$G:\mathbb R\to\mathbb R$, $\langle\pi_t^N,G\rangle$ is tight as a random sequence on $D(\mathbb R_+,\mathbb R)$. Now fix such
a function. To prove tightness for $\langle\pi_t^N,G\rangle$ it is enough to verify the following two
conditions:


(i) The finite-dimensional distributions of $\langle\pi_t^N,G\rangle$ are tight.
(ii) For every $\delta>0$

$$\lim_{\varepsilon\to 0}\limsup_{N\to\infty} \mathbb P^N_{\mu^N}\Big[ \sup_{|s-t|\le\varepsilon} |\langle\pi_t^N,G\rangle - \langle\pi_s^N,G\rangle| > \delta \Big] = 0.$$

Condition (i) is a trivial consequence of the fact that the empirical measure has
finite total mass on any compact interval. In order to prove condition (ii), consider
the martingale with respect to $\mathcal F_t$ given by

$$M_t^{G,N} = \langle\pi_t^N,G\rangle - \langle\pi_0^N,G\rangle - \int_0^t N^2\mathcal L\langle\pi_s^N,G\rangle\, ds.$$

Here the index $N$ indicates that we are considering the process speeded up by $N^2$.
Therefore,

$$\langle\pi_t^N,G\rangle - \langle\pi_s^N,G\rangle = M_t^{G,N} - M_s^{G,N} + \int_s^t N^2\mathcal L\langle\pi_r^N,G\rangle\, dr.$$




From the previous expression, condition (ii) is a consequence of the next two lem- 
mas. 


LEMMA 4.4. For every $\delta>0$ and every function $G$ in $C^2(\mathbb R)$ with compact
support,

$$\lim_{\varepsilon\to 0}\limsup_{N\to\infty} \mathbb P^N_{\mu^N}\Big[ \sup_{|s-t|\le\varepsilon} \Big| \int_s^t N^2\mathcal L\langle\pi_r^N,G\rangle\, dr \Big| > \delta \Big] = 0.$$


PROOF. In view of (4.3), since $G$ is a smooth function with compact support,
the expression inside the absolute value is bounded above by

$$C(a,G)\, \Big\{ \varepsilon + \int_s^t N\{\xi_r(0) + \xi_r(1)\}\, dr \Big\}$$

for some finite constant which depends on $a$, $G$ only. To conclude the proof, it
remains to recall Lemma 5.6. $\square$


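LEMMA 4.5. For every function $G$ in $C^2(\mathbb R)$ with compact support and every
$\delta>0$,

$$\lim_{\varepsilon\to 0}\limsup_{N\to\infty} \mathbb P^N_{\mu^N}\Big[ \sup_{|s-t|\le\varepsilon} |M_t^{G,N} - M_s^{G,N}| > \delta \Big] = 0.$$


PROOF. Denote by $\langle M^{G,N}\rangle_t$ the quadratic variation of the martingale $M_t^{G,N}$.
By the Doob inequality,

$$\mathbb P^N_{\mu^N}\Big[ \sup_{|s-t|\le\varepsilon} |M_t^{G,N} - M_s^{G,N}| > \delta \Big]
\le \mathbb P^N_{\mu^N}\Big[ \sup_{0\le t\le T} |M_t^{G,N}| > \delta/2 \Big]
\le \frac{16}{\delta^2}\, \mathbb E^N_{\mu^N}\big[\langle M^{G,N}\rangle_T\big].$$

By the explicit expression for $\langle M^{G,N}\rangle_t$ given in (4.1), the previous expression is
bounded by

$$\frac{C(a,G)}{\delta^2}\, \Big\{ \frac TN + \mathbb E^N_{\mu^N}\Big[ \int_0^T \{\xi_s(0) + \xi_s(1)\}\, ds \Big] \Big\}.$$

To conclude the proof of the lemma, it remains to apply Lemma 5.5. $\square$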


We turn now to the tightness of the second marginal of $Q^N$. Since $D_+^N(0) =
D_-^N(0) = 0$, we need only to show that

$$\limsup_{\varepsilon\to 0}\limsup_{N\to\infty} \mathbb P^N_{\mu^N}\Big[ \sup_{|s-t|\le\varepsilon} |D_+^N(t) - D_+^N(s)| > \delta \Big] = 0$$

for every $\delta>0$ and a similar statement for $D_-^N(t)$ in place of $D_+^N(t)$. In fact, we
claim that for every $\delta>0$, there exists $\varepsilon>0$ such that

$$\limsup_{N\to\infty} \mathbb P^N_{\mu^N}\Big[ \sup_{|s-t|\le\varepsilon} |D_+^N(t) - D_+^N(s)| > \delta \Big] = 0. \tag{4.6}$$

Since $D_+^N(t)$ is increasing, the previous probability is bounded above by

$$\sum_{k=0}^{[T/\varepsilon]} \mathbb P^N_{\mu^N}\big[ D_+^N((k+1)\varepsilon) - D_+^N(k\varepsilon) > \delta/2 \big].$$

It follows from Lemma 5.4 of the next section that for each $\delta>0$, there exists
$\varepsilon>0$ such that

$$\limsup_{N\to\infty} \mathbb P^N_{\mu^N}\big[ D_+^N(t+\varepsilon) - D_+^N(t) > \delta \big] = 0$$

uniformly in $0\le t\le T$. This proves that the second marginal of $Q^N$ is tight.
We summarize in the next lemma what we just obtained. Notice that we proved
tightness in the uniform topology.


LEMMA 4.6. The sequence $Q^N$ is tight in the uniform topology. In particular,
all limit points are concentrated on continuous trajectories for the vague topology.

PROOF OF THEOREM 3.2. By Lemma 4.6, the sequence is tight in the uniform
topology and by Propositions 4.1 and 4.2, all limit points are concentrated
on weak solutions $(\lambda,D)$ of (3.1). By uniqueness of weak solutions, presented in
Section 3, the theorem is proved. $\square$


5. Coupling. We prove in this section some important but technical results
which are used in the previous sections. The proofs rely on a coupling between the
process $\xi_t$ defined in Section 3 and an exclusion process $\zeta_t$ similar to $\xi_t$ with the
difference that the configuration is not translated when a single particle dies at
the boundary. The generator $\mathcal L'$ of this process is therefore $a_{-1}L_{-1} + a_1L_1 + L_b'$,
where $L_b'$ is given by

$$\begin{aligned}
(L_b' f)(\zeta) ={}& \zeta(1)[1-\zeta(0)]\,\{f(\zeta - \varrho_1) - f(\zeta)\}\\
&+ \zeta(0)[1-\zeta(1)]\,\{f(\zeta - \varrho_0) - f(\zeta)\}\\
&+ \zeta(0)\zeta(1)\,\{f(\zeta - \varrho_0 - \varrho_1) - f(\zeta)\}.
\end{aligned}$$

Notice that both marginal processes on $\mathbb Z_-$ and on $\mathbb N$ behave as an exclusion
process with disappearance at the boundary, whose hydrodynamic behavior is well
known. The leading idea of this section is to show, through appropriate couplings,
that the original process does not differ much in several aspects from the one
defined above.




Denote by $(\zeta_t : t\ge 0)$ the Markov process with generator $\mathcal L'$ speeded up by
$N^2$ and recall that we denote by $(\xi_t : t\ge 0)$ the Markov process with generator $\mathcal L$
speeded up by $N^2$. Let $D_\xi^+([s,t])$ be the total number of $\xi$-particles in $\mathbb N$ which
died in the time interval $[s,t]$. $D_\xi^-([s,t])$, $D_\zeta^{\pm}([s,t])$ are defined analogously.


LEMMA 5.1. There exists a coupling $(\zeta_t,\xi_t)$ for which

$$D_\xi^+([0,t]) + D_\xi^-([0,t]) \le 2D_\zeta^+([0,t]) + 2D_\zeta^-([0,t]) \tag{5.1}$$

for all $t\ge 0$.

PROOF. The coupling between $\xi_t$ and $\zeta_t$ can be described as follows. Assume
that the initial configurations are identical at time 0: $\xi_0 = \zeta_0$. Label all particles
and denote by $X_t^j$ (resp. $Y_t^j$) the position at time $t$ of the $j$th $\xi$- (resp. $\zeta$-) particle.
We assume that $X_0^j < X_0^k$ if $j<k$, $X_0^{0} \le 0 < X_0^{1}$, $X_0^j = Y_0^j$ for all $j$.

The $\xi$- and $\zeta$-particles with the same label jump together preserving the order of
the labels until a particle dies. If two $\xi$-particles die simultaneously, we couple the
disappearance of the $\xi$- and $\zeta$-particles. If it is a single $\xi$-particle which disappears,
assume, without loss of generality, that it has a positive label and denote this time
by $T_0$. Note that due to the translation, $X_{T_0}^j = Y_{T_0}^j + 1$ for all labels $j$ associated to
alive particles.

Denote by $T_1$ the first time after $T_0$ in which the total number of disappearances
of $\xi$-particles in $\mathbb N$ is equal to the total number of disappearances of $\xi$-particles
in $\mathbb Z_-$:

$$T_1 = \inf\{t > T_0 : D_\xi^+([T_0,t]) = D_\xi^-([T_0,t])\}.$$

In the time interval $[T_0,T_1]$, the coupling forces the $\xi$- and $\zeta$-particles with the
same label to jump together. It may happen, however, that a $\zeta$-particle on $\mathbb N$
disappears while its corresponding $\xi$-particle remains alive. In this case, the $\xi$-particle
becomes a second-class particle to allow the coupled particles to jump together.
The same phenomenon may occur on $\mathbb Z_-$, where a $\xi$-particle may disappear while
its corresponding $\zeta$-particle remains alive. In this case also the $\zeta$-particle becomes
a second-class particle. Notice, in particular, that the difference $X_t^j - Y_t^j$ does not
depend on $j$ for coupled particles.

Due to the translations, for any $T_0\le t<T_1$, the total number of $\xi$-particles
which died on $\mathbb N$ is bounded by the total number of $\zeta$-particles which died on $\mathbb N$:

$$D_\xi^+([T_0,t]) \le D_\zeta^+([T_0,t]).$$

Also due to our definition of $T_1$, for any $T_0\le t<T_1$, the total number of $\xi$-particles
which died on $\mathbb Z_-$ is bounded by the total number of $\xi$-particles which died on $\mathbb N$:

$$D_\xi^-([T_0,t]) \le D_\xi^+([T_0,t]).$$

Therefore, (5.1) holds for any $0\le t\le T_1$.




To proceed by iteration, let $K_1 = D_\zeta^+([T_0,T_1])$ be the total number of
$\zeta$-particles which died in $\mathbb N$ in the time interval $[T_0,T_1]$, and let $k_1 = D_\xi^+([T_0,T_1])$
be the total number of $\xi$-particles which died in $\mathbb N$ in the time interval $[T_0,T_1]$.
By definition of the coupling, at time $T_1$, there are $K_1 - k_1$ uncoupled $\xi$-particles
on $\mathbb N$ and $K_1 - k_1$ $\zeta$-particles which died on $\mathbb N$ associated to the $K_1 - k_1$
uncoupled $\xi$-particles. These $K_1 - k_1$ $\zeta$-particles should not be forgotten, since
they will be used to compensate the eventual death of the second-class uncoupled
$\xi$-particles. The important fact for the recurrence argument is that the number
of dead $\zeta$-particles which were not used to compensate the death of $\xi$-particles is
at least equal to the number of second-class uncoupled $\xi$-particles. Notice also that
there might be second-class uncoupled $\zeta$-particles on $\mathbb Z_-$. Since they do not play
any role in the argument, we do not refer to them again.

Assume that the first single $\xi$-particle to die after $T_1$ is in $\mathbb Z_-$. If it were in $\mathbb N$,
we could repeat the arguments presented in the last paragraph to arrive at the same
conclusions obtained there and iterate again the argument.

Denote by $T_2$ the first time after $T_1$ in which the total number of disappearances
of $\xi$-particles in $\mathbb N$ is equal to the total number of disappearances in $\mathbb Z_-$:

$$T_2 = \inf\{t > T_1 : D_\xi^+([T_1,t]) = D_\xi^-([T_1,t])\}.$$

The coupling in $[T_1,T_2]$ is the same described before, in which coupled particles
jump together until one of them dies. For $T_1\le t\le T_2$, let $L_2(t) = D_\zeta^-([T_1,t])$,
$L_2 = L_2(T_2)$, $\ell_2(t) = D_\xi^-([T_1,t])$, $\ell_2 = \ell_2(T_2)$. By definition of the coupling
$D_\xi^+([T_1,t]) \le \ell_2(t) \le L_2(t)$, so that (5.1) holds for $0\le t\le T_2$.
On the other hand, at time $T_2$, there are:


(a) at most $K_1 - k_1$ uncoupled $\xi$-particles on $\mathbb N$. There might be less since a
second-class $\xi$-particle might have died, its death being compensated by the death
of a $\zeta$-particle on $\mathbb Z_-$,

(b) $L_2 - \ell_2$ uncoupled $\xi$-particles on $\mathbb Z_-$,

(c) $K_1 - k_1$ $\zeta$-particles which died on $\mathbb N$ (in the time interval $[T_0,T_1]$) and which
are associated to the remaining uncoupled $\xi$-particles on $\mathbb N$,

(d) $L_2 - \ell_2$ $\zeta$-particles which died on $\mathbb Z_-$ associated to the remaining
uncoupled $\xi$-particles on $\mathbb Z_-$.


Thus, at time $T_2$, the total number of uncoupled second-class $\xi$-particles is still
smaller than the total number of dead and disassociated $\zeta$-particles.

Assume, without loss of generality, that the first single particle to die after $T_2$ is
in $\mathbb N$. Denote by $T_3$ the first time after $T_2$ in which the total number of
disappearances of $\xi$-particles on $\mathbb N$ is equal to the total number of disappearances in $\mathbb Z_-$:

$$T_3 = \inf\{t > T_2 : D_\xi^+([T_2,t]) = D_\xi^-([T_2,t])\}.$$

The coupling remains the same. The next argument, though elementary, requires
much notation. For $T_2\le t\le T_3$, let:




(a) $j_3(t)$ be the total number of second-class $\xi$-particles which die on $\mathbb N$ in the
time interval $[T_2,t]$,

(b) $k_3(t) - j_3(t)$ be the total number of first-class $\xi$-particles which die on $\mathbb N$ in
the time interval $[T_2,t]$,

(c) $K_3(t) - j_3(t)$ be the total number of $\zeta$-particles which die on $\mathbb N$ in the time
interval $[T_2,t]$,

(d) $\ell_3(t)$ be the total number of $\xi$-particles which die on $\mathbb Z_-$ in the time interval
$[T_2,t]$.


By the definition of the coupling, $\ell_3(t) \le k_3(t) \le K_3(t)$. The $j_3(t)$ second-class
$\xi$-particles which died on $\mathbb N$ were associated to $\zeta$-particles which died before.
Since there is a factor 2 in (5.1), we also associate to these $\zeta$-particles $j_3(t)\wedge\ell_3(t)$
$\xi$-particles which died on $\mathbb Z_-$. The $k_3(t) - j_3(t)$ first-class $\xi$-particles which died
on $\mathbb N$ are taken care of by $k_3(t) - j_3(t)$ $\zeta$-particles which died on $\mathbb N$. The factor 2
in (5.1) allows us to include $(\ell_3(t) - j_3(t))^+ \le k_3(t) - j_3(t)$ $\xi$-particles which died
on $\mathbb Z_-$. Up to this point we showed that all disappearances of $\xi$-particles in $[T_2,t]$
can be compensated by disappearance of $\zeta$-particles in $[T_2,t]$ and by disassociated
$\zeta$-particles which died before $T_2$. Therefore, (5.1) holds in the time interval
$[T_2,T_3]$.

To be able to iterate this argument notice that there are $K_3(T_3) - k_3(T_3)$ second-class
$\xi$-particles created on $\mathbb N$ in the time interval $[T_2,T_3]$ and $K_3(T_3) - k_3(T_3)$
$\zeta$-particles which died in this interval and whose deaths were not used to compensate
$\xi$ deaths. We may therefore associate these new second-class $\xi$-particles to
these newly dead $\zeta$-particles and iterate the argument. This concludes the proof of
the lemma. $\square$


Denote by $\zeta_t^-$, $\zeta_t^+$ the marginals of the process $\zeta_t$ on $\mathbb Z_-$, $\mathbb N$, respectively. Notice
that both marginals evolve as an exclusion process in which particles leave the
system at the boundary. This system plays an important role in the sequel and
deserves a notation. For $b>0$, denote by $\beta_t$ the Markov process on $\{0,1\}^{\mathbb Z_+}$ with
generator $\mathfrak L = \mathfrak L_b$ given by

$$(\mathfrak L f)(\beta) = b\sum_{x\ge 0} \{f(\beta^{x,x+1}) - f(\beta)\} + \beta(0)\,\{f(\beta - \varrho_0) - f(\beta)\}.$$

For $T>0$ and a measure $\mu$ on $\{0,1\}^{\mathbb Z_+}$, denote by $\mathbf P_\mu^N$ the probability on
$D([0,T],\{0,1\}^{\mathbb Z_+})$ induced by the Markov process $\beta_t$ speeded up by $N^2$ starting
from $\mu$. Expectation with respect to $\mathbf P_\mu^N$ is denoted by $\mathbf E_\mu^N$.

It is well known that the process $\beta_t$ has a hydrodynamic description. Let $\pi_t^N$ be
the empirical measure associated to $\beta_t$: $\pi_t^N = N^{-1}\sum_{x\ge 0} \beta_t(x)\,\delta_{x/N}$.


PROPOSITION 5.2. Consider a sequence of probability measures $\mu^N$ on
$\{0,1\}^{\mathbb Z_+}$ such that

$$\lim_{N\to\infty} \mu^N\Big[\, \Big| \langle\pi^N,G\rangle - \int \rho_0\, G\, du \Big| > \delta \Big] = 0$$




for some measurable function $\rho_0:\mathbb R_+\to[0,1]$, every $\delta>0$ and every continuous
function with compact support $G$. Then,

$$\lim_{N\to\infty} \mathbf P^N_{\mu^N}\Big[\, \Big| \langle\pi_t^N,G\rangle - \int \rho(t,u)\,G(u)\, du \Big| > \delta \Big] = 0$$

for every $\delta>0$ and continuous function with compact support $G$, where $\rho$ is the
solution of the linear equation

$$\begin{cases} \partial_t\rho = b\Delta\rho \quad \text{on } \mathbb R_+,\\ \rho(t,0) = 0, \quad \rho(0,\cdot) = \rho_0(\cdot). \end{cases} \tag{5.2}$$

The proof of this result is similar to the one of Theorem 2.1 in [5]. Moreover,
the solution of (5.2) can be represented in terms of a standard Brownian motion
$W_t$ with absorption at the boundary $u=0$:

$$\rho(t,u) = E_u[\rho_0(\sqrt b\, W_t)]. \tag{5.3}$$

The coupling presented in Lemma 5.1 together with the hydrodynamic behavior
stated in Proposition 5.2 allows us to estimate the total number of particles which left
the system in the original process $\xi_t$. This is the content of the next two lemmas.
Recall the definition of $D_+^N(t)$, $D_-^N(t)$ introduced in Section 4. Denote by $\mathbf 1$ the
probability measure on $\{0,1\}^{\mathbb Z_+}$ concentrated on the configuration with all sites
occupied and by $D_\beta^N(t)$ the total number of $\beta$-particles which left the system before
time $t$, divided by $N$.

LEMMA 5.3. There exists a finite constant $C_0$ depending only on $a_1$, $a_{-1}$ such
that

$$\limsup_{N\to\infty} \mathbb{E}_\mu [D^N_+(t)] \le C_0 \sqrt{t}$$

for all $t > 0$ and all probability measures $\mu$. The lemma remains in force if $D^N_+$ is
replaced by $D^N_-$.


PROOF. By Lemma 5.1, the expectation in the statement of the lemma
is bounded above by $2\, \mathbb{E}[N^{-1} D_\zeta(t)]$, where $D_\zeta(t)$ stands for the total number
of particles which left the system in the time interval $[0,t]$ for the $\zeta$
process. Since both marginals of $\zeta$ evolve as an exclusion process with
disappearance at the boundary, the previous expectation is bounded above by
$4 \max_{b = a_1, a_{-1}} \mathbb{E}^N_\mu[N^{-1} D_\beta(t)]$. By monotonicity, this latter expectation is less than or
equal to $4 \max_{b = a_1, a_{-1}} \mathbb{E}^N_{\mathbf{1}}[N^{-1} D_\beta(t)]$. By the hydrodynamic limit of $\beta$, this
expectation converges, as $N \uparrow \infty$, to

$$4 \max_{b = a_1, a_{-1}} \int_{\mathbb{R}_+} \{1 - \rho_b(t,u)\}\, du,$$

where $\rho_b$ is the solution of (5.2) with initial condition $\rho_0$ constant equal to 1. With
this initial condition, the solution of this equation can be written as $\rho_b(t,u) =
1 - 2P[B_t > u/\sqrt{2b}]$, where $B_t$ is a standard Brownian motion. In particular, the
previous displayed equation is equal to

$$4 \sqrt{2t}\, \max\{\sqrt{a_{-1}}, \sqrt{a_1}\}\, E[|B_1|].$$

This concludes the proof of the lemma. □
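
The last step of the proof uses the identity $\int_{\mathbb{R}_+} \{1 - \rho_b(t,u)\}\,du = \sqrt{2bt}\,E[|B_1|]$. The following is a minimal numerical sketch of that identity, not part of the original argument. It assumes the standard facts $1 - 2P[B_t > v] = \operatorname{erf}(v/\sqrt{2t})$ and $E[|B_1|] = \sqrt{2/\pi}$; the function names and the truncation point of the integral are illustrative choices.

```python
import math

def rho_b(t, u, b):
    """rho_b(t, u) = 1 - 2 P[B_t > u / sqrt(2b)], written as erf(u / sqrt(4 b t))."""
    return math.erf(u / math.sqrt(4.0 * b * t))

def lost_mass(t, b, steps=20000):
    """Trapezoidal approximation of the integral of 1 - rho_b(t, u) over u >= 0."""
    upper = 12.0 * math.sqrt(4.0 * b * t)  # assumption: the integrand is negligible beyond this
    h = upper / steps
    total = 0.5 * ((1.0 - rho_b(t, 0.0, b)) + (1.0 - rho_b(t, upper, b)))
    for k in range(1, steps):
        total += 1.0 - rho_b(t, k * h, b)
    return total * h

def closed_form(t, b):
    """sqrt(2 b t) * E|B_1|, using E|B_1| = sqrt(2 / pi)."""
    return math.sqrt(2.0 * b * t) * math.sqrt(2.0 / math.pi)

if __name__ == "__main__":
    print(lost_mass(0.7, 1.3), closed_form(0.7, 1.3))
```

The two printed values should agree to several decimal places, which is consistent with the $\sqrt{t}$ growth claimed in Lemma 5.3.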


LEMMA 5.4. For every $\delta > 0$, $T > 0$, there exists $\varepsilon > 0$ such that

$$\limsup_{N\to\infty} \mathbb{P}_{\mu^N} [D^N_+(t+\varepsilon) - D^N_+(t) > \delta] = 0$$

uniformly for $0 \le t \le T$. The statement remains in force if we replace $D^N_+$ by $D^N_-$.


PROOF. Fix $0 \le t \le T$. Denote by $\mu^N(t)$ the state of the process at time $t$.
With this notation the probability appearing in the statement can be written as

$$\mathbb{P}_{\mu^N(t)} [D^N_+(\varepsilon) > \delta].$$

By Lemma 5.1 and by attractiveness of the $\beta$-process, the previous expression is
less than or equal to

$$2 \max_{b = a_1, a_{-1}} \mathbb{P}^N_{\mathbf{1}} [N^{-1} D_\beta(\varepsilon) > \delta/4].$$

By the hydrodynamic limit of $\beta$ and the proof of the previous lemma, $N^{-1} D_\beta(\varepsilon)$
converges in probability to

$$\int_{\mathbb{R}_+} \{1 - \rho_b(\varepsilon, u)\}\, du = \sqrt{2b\varepsilon}\, E[|B_1|].$$

In particular, if $\varepsilon$ is chosen small enough for the last expression to be less than $\delta/4$,
the previous probability vanishes as $N \uparrow \infty$. This concludes the proof of the
lemma. □


LEMMA 5.5. For every $t > 0$,


N ft 
up EM, f NEO + &(1)}ds| < oo. 


PROOF. Recall the definition of the martingale M M (t) introduced in the begin- 
ning of the proof of Lemma 4.3. In particular, EP Lo NE (1) ds] = EP [DY (t)]. 
It remains to recall Lemma 5.3 to estimate the expectation appearing in the state- 


ment of the lemma for large N. For small N, it is enough to bound DN (t) by 
a Poisson point process. [C] 




LEMMA 5.6. For every $T > 0$ and $\delta > 0$,


lim lim sup Pw F sup ü ^ N(O) --5()ds > 8| = 0. 


E>0 N..oo O<1<T 


PROOF. Recall the definition of the martingale $M^N_+(t)$ introduced in the
beginning of the proof of Lemma 4.3. With this notation, in order to prove the lemma
we need to show that

$$\lim_{\varepsilon \to 0} \limsup_{N\to\infty} \mathbb{P}_{\mu^N} \Big[ \sup_{0 \le t \le T} \{M^N_+(t+\varepsilon) - M^N_+(t)\} > \delta \Big] = 0,$$

$$\lim_{\varepsilon \to 0} \limsup_{N\to\infty} \mathbb{P}_{\mu^N} \Big[ \sup_{0 \le t \le T} \{D^N_+(t+\varepsilon) - D^N_+(t)\} > \delta \Big] = 0,$$

and a similar statement with $M^N_-(t)$, $D^N_-(t)$ in place of $M^N_+(t)$, $D^N_+(t)$.
The martingale part is easy. By Doob's inequality and the explicit formula for
the quadratic variation of $M^N_+(t)$ given at the beginning of the proof of Lemma 4.3,
the probability which needs to be estimated is less than or equal to


4 [BE 
P^ sup MYG > 4/2) < -—EN n 4s . 
pN xc (t)| / §2 u” 0 &s( ) 
By the previous lemma, this expression vanishes as $N \uparrow \infty$ for any $\varepsilon > 0$, $\delta > 0$.

On the other hand, the jump part has been estimated just after the proof of
Lemma 4.5. □


We conclude this section with the proof of Proposition 4.2 which relies on the
following lemma. For a subset $I$ of $\mathbb{R}$, denote by $\mathbf{1}\{I\}$ the indicator function of $I$.


LEMMA 5.7. Fix $T > 0$ and let $\mu^N$ be a probability measure satisfying
assumption (H1). There exist a finite constant $A_0$ and a strictly positive continuous
function $R_T : (-\infty, -A_0] \cup [A_0, \infty) \to \mathbb{R}_+$ such that

$$\limsup_{N\to\infty} \mathbb{P}_{\mu^N} \Big[ \langle \pi^N_t, \mathbf{1}\{[a,b]\} \rangle < \int_a^b R_T(u)\, du \Big] = 0$$

for all $0 \le t \le T$ and all intervals $[a,b]$ such that $a \ge A_0$ or $b \le -A_0$.


PROOF. Fix $T > 0$, $0 \le t \le T$ and let $|D^N|(t) = D^N_+(t) + D^N_-(t)$ be the
total number of particles which left the system before time $t$ divided by $N$. By
Lemma 5.4, there exists $A_1 > 0$ such that

$$\limsup_{N\to\infty} \mathbb{P}_{\mu^N} [\, |D^N|(T) > A_1 ] = 0.$$

Fix such $A_1 > 0$ and couple the $\xi$ process with a $\zeta$ process as in Lemma 5.1 with
the additional property that $\zeta$-particles which jump to the interval $\{-A_1 N, \ldots,
A_1 N\}$ are removed. In particular, all $\zeta$-particles initially in this interval become
instantaneously second-class particles.

On the set $|D^N|(T) \le A_1$, we have that a particle $Y^j$ is alive only if the particle
$X^j$ is alive and that the distance $|X^j_t - Y^j_t|$, which does not depend on $j$ for alive
particles, is bounded by $N |D^N|(T)$. Therefore, if we fix an interval $I = [a,b]$ with
$a \ge 3A_1$ or $b \le -3A_1$, on the set $|D^N|(T) \le A_1$,

$$\langle \pi^N(\xi_t), \mathbf{1}\{I\} \rangle = N^{-1} \sum_{x/N \in I} \xi_t(x) \ge N^{-1} \sum_j \mathbf{1}\{X^j_t \in NI\},$$

where the last summation is performed over all indices $j$ corresponding to alive
particles, and is bounded below by

$$\inf_{|v| \le A_1} N^{-1} \sum_j \mathbf{1}\{Y^j_t \in N(I + v)\} = \inf_{|v| \le A_1} \langle \pi^N(\zeta_t), \mathbf{1}\{v + I\} \rangle.$$

In these formulas, $v + I = \{u + v : u \in I\}$ and $NI = \{Nu : u \in I\}$.

Since $\inf_{|v| \le A_1} \langle \pi, \mathbf{1}\{v + I\} \rangle$ is a continuous function for the vague topology
because all measures have density bounded by 1, by the hydrodynamic limit for $\zeta$,
the last expression converges in probability to

$$\inf_{|v| \le A_1} \langle \rho_t, \mathbf{1}\{v + I\} \rangle \ge \int_a^b \inf_{|v - u| \le A_1} \rho(t,v)\, du,$$

where $\rho$ is the solution of the linear equation (5.2) with initial condition $\rho_0$ and
boundary condition $\rho(t, \pm A_1) = 0$. By the explicit formula (5.3) for the solution
of (5.2), $\rho(t,u)$ is smooth. Moreover, since $\rho_0$ satisfies condition (H1), $\rho(t,u)$ is
strictly positive. Therefore, for each $u \ge 3A_1$,

$$R(T,u) = \inf_{0 \le t \le T}\ \inf_{|v - u| \le A_1} \rho(t,v) > 0.$$

This concludes the proof of the lemma. □


PROOF OF PROPOSITION 4.2. In Lemma 5.7 we proved the proposition for
intervals far from the origin. We now estimate the density on intervals close to the
origin by repeating the argument presented in the proof of Lemma 5.7 and using
the fact that the total number of particles which leave the system in a small time
interval cannot be too large.

Fix $T > 0$ and recall from (4.6) that for each $\delta > 0$ there exists $\varepsilon = \varepsilon(\delta) > 0$
such that

$$\limsup_{N\to\infty} \mathbb{P}_{\mu^N} \Big[ \sup_{0 \le t \le T} \{ |D^N|(t+\varepsilon) - |D^N|(t) \} > \delta \Big] = 0.$$

We shall estimate without loss of generality the density on $\mathbb{R}_+$. Fix $a > 0$. Let
$\delta = a/3$, $s = t - \varepsilon(\delta)$ and denote by $\mu^N(s)$ the state of the $\xi$ process at time $s$. By
Lemma 5.7, at time $s$, the particles' density is bounded below by a strictly positive
function $R_T$:

$$\limsup_{N\to\infty} \mu^N(s) [ \langle \pi^N, \mathbf{1}\{I\} \rangle < R_T(I) ] = \limsup_{N\to\infty} \mathbb{P}_{\mu^N} [ \langle \pi^N_s, \mathbf{1}\{I\} \rangle < R_T(I) ] = 0$$

for all intervals $I = [c,d]$ with $c > A_0$. Here, $R_T(I) = \int_I R_T(u)\, du$.

Starting from $\mu^N(s)$, we couple the $\xi$ process with a $\zeta$ process as in the
proof of Lemma 5.1 with the additional feature that $\zeta$-particles which reach
the set $\{1, \ldots, \delta N\}$ are killed. Following the argument presented in the proof of
Lemma 5.7, on the set $|D^N|(t) - |D^N|(s) \le \delta$ we obtain that

$$\langle \pi^N(\xi_t), \mathbf{1}\{I\} \rangle \ge \inf_{|v| \le \delta} \langle \pi^N(\zeta_t), \mathbf{1}\{v + I\} \rangle,$$

for every interval $I = [c,d]$ with $c > a$. The asymptotic behavior of the right-hand
side of this inequality is given by the hydrodynamic limit of the $\zeta$ process in the
time interval $[s,t]$. The fact that we do not know the law of large numbers for the
empirical measure at time $s$ is not a problem. In fact, it is not difficult to show
that $\pi^N(\zeta_\cdot)$ is a tight sequence and that all limit points are concentrated on weak
solutions of (5.2) with boundary condition $\rho(r, \delta) = 0$ for $s \le r \le t$ and on
trajectories $\pi$ which at time $s$ are bounded below by $R_T$ on the interval $[A_0, \infty)$. By
monotonicity of weak solutions of (5.2), at time $t$ the empirical measure has a
density bounded below by the solution of (5.2) with initial condition $R_T \mathbf{1}\{[A_0, \infty)\}$.
Therefore, in the limit $N \uparrow \infty$, the right-hand side of the previous equation is
bounded below by

$$\int_c^d \inf_{|v - u| \le \delta} \rho(\varepsilon, v)\, du,$$

where $\rho$ is the solution of (5.2) with initial condition $\rho_0 = R_T \mathbf{1}\{[A_0, \infty)\}$ and
boundary condition $\rho(r, \delta) = 0$ for $0 \le r \le \varepsilon$. Here again, the explicit formula (5.3)
for the solution of (5.2) shows that the continuous function $\inf_{|v - u| \le \delta} \rho(\varepsilon, v)$ is
strictly positive on $[a, \infty)$. This concludes the proof of the proposition. □


REFERENCES 


[1] BERTINI, L., BUTTÀ, P. and RÜDIGER, B. (2000). Interface dynamics and Stefan problem
from microscopic conservative model. Rend. Mat. Appl. 19 547–581. MR1789487

[2] CHAYES, L. and SWINDLE, G. (1996). Hydrodynamic limits for one-dimensional particle
systems with moving boundaries. Ann. Probab. 24 559–598. MR1404521

[3] VALLE, G. (2003). Evolution of the interfaces in a two dimensional Potts model. Preprint.

[4] KIPNIS, C. and LANDIM, C. (1997). Scaling Limits of Interacting Particle Systems. Springer,
Berlin. MR1707314

[5] LANDIM, C., OLLA, S. and VOLCHAN, S. (1998). Driven tracer particle and Einstein
relation in one dimensional symmetric simple exclusion process. Comm. Math. Phys. 192
287–307. MR1617558

[6] REZAKHANLOU, F. (1990). Hydrodynamic limit for a system with finite range interactions.
Comm. Math. Phys. 129 445–480. MR1051500

[7] RUBINSTEIN, L. I. (1971). The Stefan Problem. Amer. Math. Soc., Providence, RI.
MR0351348

[8] STEFAN, J. (1889). Über einige Probleme der Theorie der Wärmeleitung. S.-B. Wien Akad. Mat.
Natur. 98 473–484.


IMPA 

ESTRADA DONA CASTORINA 110 
CEP 22460 RIO DE JANEIRO 
BRASIL 

AND 

CNRS UMR 6085 

UNIVERSITÉ DE ROUEN 

76128 MONT SAINT AIGNAN 
FRANCE 

E-MAIL: landim@impa.br


ECOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE 
LAUSANNE 

SUISSE 

E-MAIL: glauco.valle@epfl.ch


The Annals of Probability 

2006, Vol. 34, No. 2, 804–819

DOI: 10.1214/009117905000000675 

© Institute of Mathematical Statistics, 2006 


RIFFLE SHUFFLES OF DECKS WITH REPEATED CARDS 


By MARK CONGER AND D. VISWANATH¹
University of Michigan 


By a well-known result of Bayer and Diaconis, the maximum entropy 
model of the common riffle shuffle implies that the number of riffle shuffles
necessary to mix a standard deck of 52 cards is either 7 or 11, with the
former number applying when the metric used to define mixing is the total 
variation distance and the latter when it is the separation distance. This and 
other related results assume all 52 cards in the deck to be distinct and require 
all 52! permutations of the deck to be almost equally likely for the deck to 
be considered well mixed. In many instances, not all cards in the deck are 
distinct and only the sets of cards dealt out to players, and not the order in 
which they are dealt out to each player, needs to be random. We derive tran- 
sition probabilities under riffle shuffles between decks with repeated cards to 
cover some instances of the type just described. We focus on decks with cards 
all of which are labeled either 1 or 2 and describe the consequences of having
a symmetric starting deck of the form $1, \ldots, 1, 2, \ldots, 2$ or $1, 2, \ldots, 1, 2$.
Finally, we consider mixing times for common card games. 


1. Introduction. The connection between examples and concepts in proba- 
bility theory is a particularly close one. That examples derived from the question 
"How many shuffles mix a deck of cards?" have featured prominently in the devel- 
opment of the convergence theory for Markov chains by Persi Diaconis and others 
can be seen in this light. This article deals with riffle shuffling, which is the most 
common way of shuffling cards. 

There are $2^n$ ways to cut a deck of $n$ cards into two packets and then riffle them
together since a card that ends up in the $i$th position can be dropped by either the
left hand or the right hand. The maximum entropy model assigns equal probability
to all these $2^n$ riffle shuffles. More generally, the maximum entropy model assigns
equal probability to all $a^n$ $a$-shuffles, with an $a$-shuffle being a way to cut a deck
into $a$ packets and then riffle them together. Several equivalent descriptions of the
$a$-shuffle have been given by Bayer and Diaconis [2]. The $a$-shuffle with $a = 2$ is
also described by Epstein [5] who calls it the amateur shuffle.

We will refer to elements of the group $S_n$ of permutations of $\{1, 2, \ldots, n\}$ as
shuffles. If $\pi \in S_n$ and $\pi(i) = j$, then by convention the shuffle $\pi$ sends the card
in the $i$th position to the $j$th position. The number of descents of $\pi$ is defined as the


Received October 2003; revised January 2005. 


1 Supported in part by a fellowship from the Sloan Foundation. 
AMS 2000 subject classifications. 60C05, 60J20.
Key words and phrases. Descents, multisets, riffle shuffles. 


804 


SHUFFLING DECKS WITH REPEATED CARDS 805 


number of positions $1 \le i < n$ at which $\pi(i) > \pi(i+1)$. Bayer and Diaconis [2]
proved that the probability that an $a$-shuffle results in a shuffle $\pi$ with $d$ descents
is given by $\frac{1}{a^n}\binom{a+n-d-1}{n}$. The picture below shows a 3-shuffle of 6 cards.


The bottom line indicates that the 0th, 1st and 2nd packets in the cut have 3, 2
and 1 cards, respectively. The top line indicates that the 1st, 2nd, 3rd, 4th, 5th and
6th cards in the shuffled deck are dropped from the 2nd, 0th, 1st, 1st, 0th and 0th
packets, respectively. If the numbers are ignored, the arrows alone depict a shuffle.
In a depiction of a shuffle such as the one above, a descent corresponds to a
crossing between arrows that originate at adjacent positions. The shuffle depicted above
has 2 descents, and therefore, according to Bayer and Diaconis [2], the probability
that an $a$-shuffle results in the shuffle depicted above is $\frac{1}{a^6}\binom{a+3}{6}$.
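
As an illustration of the descent count and of the Bayer and Diaconis formula, here is a minimal sketch. The tuple `PICTURED` is read off from the verbal description of the picture (packet 0 supplies shuffled positions 2, 5, 6; packet 1 supplies positions 3, 4; packet 2 supplies position 1); the function names are illustrative and not taken from the paper.

```python
from fractions import Fraction
from itertools import permutations
from math import comb

def descents(pi):
    """Number of positions i with pi(i) > pi(i+1); pi is a tuple of 1-based values."""
    return sum(1 for x, y in zip(pi, pi[1:]) if x > y)

def shuffle_probability(pi, a):
    """Probability that an a-shuffle equals pi: C(a + n - d - 1, n) / a^n."""
    n, d = len(pi), descents(pi)
    return Fraction(comb(a + n - d - 1, n), a ** n)

# pi(i) = new position of the card that started in position i.
PICTURED = (2, 5, 6, 3, 4, 1)

if __name__ == "__main__":
    print(descents(PICTURED))                # 2
    print(shuffle_probability(PICTURED, 3))  # 1/729
    # The probabilities of all shuffles of 4 cards should add up to 1.
    print(sum(shuffle_probability(p, 3) for p in permutations(range(1, 5))))
```

With $a = 3$ the pictured shuffle has probability $\binom{6}{6}/3^6 = 1/729$, in agreement with the expression $\frac{1}{a^6}\binom{a+3}{6}$.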

In nearly all of the literature on card shuffling, the cards in a deck are assumed to
be distinct. We allow cards to be indistinguishable. In our notation, both $1, 1, 2, 1$
and $1^2, 2, 1$ denote the deck with two cards labeled 1 above a card labeled 2 above
a card labeled 1. Let $a_1, a_2, \ldots, a_n$ be a deck. When it is shuffled using $\pi \in S_n$, the
deck obtained is $a_{\pi^{-1}(1)}, a_{\pi^{-1}(2)}, \ldots, a_{\pi^{-1}(n)}$. We define $\pi(D_1; D_2)$ as the set of
shuffles $\pi \in S_n$ such that $\pi$ applied to $D_1$ results in $D_2$. The descent polynomial
of the shuffles from the starting deck $D_1$ to the ending deck $D_2$ is defined as

$$\sum_{\pi \in \pi(D_1; D_2)} x^{\mathrm{des}(\pi)},$$

where $\mathrm{des}(\pi)$ is the number of descents in $\pi$. For example, the descent polynomial
of shuffles from $1, 1, 2, 2$ to $1, 2, 2, 1$ is $2x + 2x^2$.
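
The definition can be checked on small decks by direct enumeration. The sketch below is a brute-force illustration only, feasible for a handful of cards; it is not the recursion developed later in the paper, and the helper names are hypothetical.

```python
from collections import Counter
from itertools import permutations

def descents(pi):
    """Number of positions i with pi(i) > pi(i+1)."""
    return sum(1 for x, y in zip(pi, pi[1:]) if x > y)

def apply_shuffle(deck, pi):
    """Move the card at position i to position pi(i) (1-based values in pi)."""
    out = [None] * len(deck)
    for i, card in enumerate(deck):
        out[pi[i] - 1] = card
    return tuple(out)

def descent_polynomial(d1, d2):
    """Map d -> number of shuffles from d1 to d2 that have d descents."""
    counts = Counter()
    for pi in permutations(range(1, len(d1) + 1)):
        if apply_shuffle(d1, pi) == tuple(d2):
            counts[descents(pi)] += 1
    return dict(counts)

if __name__ == "__main__":
    # Example from the text: 2x + 2x^2.
    print(descent_polynomial((1, 1, 2, 2), (1, 2, 2, 1)))
```

The returned dictionary maps each exponent to its coefficient, so the example in the text corresponds to the entries `1: 2` and `2: 2`.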
If the descent polynomial of the shuffles from a deck $D_1$ with $n$ cards to a deck
$D_2$ is $\sum_d c_d x^d$, the probability that an $a$-shuffle of $D_1$ results in the deck $D_2$ is

$$p_a = \frac{1}{a^n} \sum_{d=0}^{n-1} c_d \binom{a+n-d-1}{n}, \tag{1.1}$$

a formula obtained by summing over all the shuffles in $\pi(D_1; D_2)$. The system of
linear equations (1.1) can be inverted to obtain

$$c_d = p_{d+1} (d+1)^n - p_d\, d^n \binom{n+1}{1} + p_{d-1} (d-1)^n \binom{n+1}{2} - \cdots + (-1)^d p_1 1^n \binom{n+1}{d} \tag{1.2}$$

806 M. CONGER AND D. VISWANATH 


for $0 \le d < n$. It is possible to pass back and forth between the transition
probabilities $p_a$ and the descent polynomial of $\pi(D_1; D_2)$ using (1.1) and (1.2).
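
A short sketch of this round trip, under the assumption that (1.2) is the alternating sum $c_d = \sum_{k=0}^{d} (-1)^k \binom{n+1}{k}\, p_{d+1-k}\, (d+1-k)^n$. Exact rational arithmetic is used so that the inversion can be checked without rounding; the function names are illustrative.

```python
from fractions import Fraction
from math import comb

def transition_probability(coeffs, n, a):
    """Formula (1.1): coeffs[d] is the coefficient c_d of x^d."""
    total = sum(c * comb(a + n - d - 1, n) for d, c in enumerate(coeffs))
    return Fraction(total, a ** n)

def coefficients_from_probabilities(p, n):
    """Formula (1.2): p(a) returns p_a; the result is [c_0, ..., c_{n-1}]."""
    return [
        sum((-1) ** k * comb(n + 1, k) * p(d + 1 - k) * (d + 1 - k) ** n
            for k in range(d + 1))
        for d in range(n)
    ]

if __name__ == "__main__":
    # Descent polynomial 2x + 2x^2 of the shuffles from 1,1,2,2 to 1,2,2,1.
    coeffs, n = [0, 2, 2, 0], 4
    p = lambda a: transition_probability(coeffs, n, a)
    print(p(2))                                   # 1/8
    print(coefficients_from_probabilities(p, n))  # recovers the coefficients
```

Only the values $p_1, \ldots, p_n$ are needed to recover all $n$ coefficients.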

In Section 2 we deduce efficient recursions for the descent polynomial from the
starting deck $D_1$ to the ending deck $D_2$ when either $D_1$ or $D_2$ is a sorted deck
of the form $1^{n_1}, 2^{n_2}, \ldots, h^{n_h}$. We derive a formula for the transition probabilities
when $D_1 = 1, 2, \ldots, h, x^n$ without using the descent polynomial. Sections 3 and 4
consider the starting decks $D_1 = 1^n, 2^n$ and $D_1 = (1, 2)^n$. Section 5 summarizes
mixing times for card games obtained using results in the preceding sections and
Monte Carlo simulations.

Although decks with repeated cards do not seem to have been considered, the 
work of Diaconis, McGrath and Pitman [4], Fulman [7] and Lalley [11] on cycle 
decompositions, and of Fulman [8] on increasing subsequences are in a somewhat 
similar vein. The thesis of Reyes [13] has new results, as well as many references 
related to other types of shuffles. 


2. Transition probabilities. We begin with a recursive algorithm to obtain
the descent polynomial of shuffles from $1^{n_1}, 2^{n_2}, \ldots, h^{n_h}$ to a deck $D$ which has
the same $n_1 + n_2 + \cdots + n_h$ cards but in a different order. Each of the numbers
$n_1, n_2, \ldots, n_h$ is a positive integer. The transition probability under an $a$-shuffle
can be obtained using the descent polynomial and (1.1).

We assume the starting deck to be $1^{n_1}, 2^{n_2}, \ldots, h^{n_h}$, which is in sorted order.
We denote by $D(i, c)$ the position of the $i$th card labeled $c$ in $D$. For example,
if $D = 1, 2, 1, 1, 2, 2$, then $D(2, 1) = 3$. The deck obtained from $D$ by keeping
the cards labeled $1, 2, \ldots, e$ and by discarding cards with all other labels will be
denoted $D_{1e}$. Similarly, the deck obtained from $D$ by keeping the cards with labels
$f, \ldots, h$ and discarding other cards will be denoted by $D_{fh}$. We assume $1 < \cdots <
e < f < \cdots < h$ and that there is no card whose label is in-between $e$ and $f$, or,
equivalently, $f = e + 1$. Let $N = n_1 + n_2 + \cdots + n_h$, $N_1 = n_1 + n_2 + \cdots + n_e$ and
$N_2 = N - N_1$. Then $N$, $N_1$ and $N_2$ equal the number of cards in $D$, $D_{1e}$ and $D_{fh}$,
respectively.

Consider the set of all shuffles $\pi$ from the sorted deck $1^{n_1}, 2^{n_2}, \ldots, h^{n_h}$ to $D$
such that $\pi(1) = D(i, 1)$ and $\pi(N) = D(j, h)$, where $1 \le i \le n_1$ and $1 \le j \le n_h$.
The number of these shuffles with $d$ descents is set equal to the coefficient of $x^d$
to define the polynomial $p_{i,j}(x)$.

To obtain a recursion for $p_{i,j}(x)$, consider the set of shuffles from the sorted
deck $1^{n_1}, 2^{n_2}, \ldots, e^{n_e}$ to $D_{1e}$ and the set of shuffles from the sorted deck
$f^{n_f}, \ldots, h^{n_h}$ to $D_{fh}$. Define $q_{i,k}(x)$, for $1 \le i \le n_1$ and $1 \le k \le n_e$, as the
polynomial in which the coefficient of $x^d$ equals the number of shuffles $\pi$ with $d$ descents
belonging to the first set which satisfy $\pi(1) = D_{1e}(i, 1)$ and $\pi(N_1) = D_{1e}(k, e)$.
The polynomial $r_{l,j}(x)$, for $1 \le l \le n_f$ and $1 \le j \le n_h$, is defined similarly, with
the coefficient of $x^d$ equal to the number of shuffles $\pi$ with $d$ descents in the
second set which satisfy $\pi(1) = D_{fh}(l, f)$ and $\pi(N_2) = D_{fh}(j, h)$. Then the
following recursive relationship holds:

$$p_{i,j}(x) = \sum_{k,l} q_{i,k}(x)\, r_{l,j}(x)\, x^{\epsilon(k,l)}. \tag{2.1}$$

The indices $k$ and $l$ vary over $1 \le k \le n_e$ and $1 \le l \le n_f$. The exponent $\epsilon(k, l)$ is
0 if $D(k, e) < D(l, f)$ and 1 if $D(k, e) > D(l, f)$.

To prove (2.1), we consider a bijection between $\pi(1^{n_1}, 2^{n_2}, \ldots, h^{n_h}; D)$ and
$\pi(1^{n_1}, \ldots, e^{n_e}; D_{1e}) \times \pi(f^{n_f}, \ldots, h^{n_h}; D_{fh})$. Let the shuffle $\pi$ map to the pair of
shuffles $(\pi_1, \pi_2)$ under this yet to be defined bijection. If position $i$ is occupied
by a card labeled $\delta$ in the starting deck $1^{n_1}, 2^{n_2}, \ldots, h^{n_h}$ and $\pi(i) = D(j, \delta)$, then
$\pi_1(i) = D_{1e}(j, \delta)$ if $1 \le \delta \le e$ and $\pi_2(i - N_1) = D_{fh}(j, \delta)$ if $f \le \delta \le h$, by
definition of the bijection. To complete the proof of (2.1), we relate the number of
descents of $\pi$ to the number of descents of $\pi_1$ and $\pi_2$. The number of descents
of $\pi$ equals the sum of the number of descents of $\pi_1$ and $\pi_2$ if $\pi(N_1) = D(k, e)$,
$\pi(N_1 + 1) = D(l, f)$ and $D(k, e) < D(l, f)$. However, if $D(k, e) > D(l, f)$, the
sum must be incremented by 1.

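The base case of the recurrence (2.1) occurs when the starting deck has cards
of only one type. Take this deck to be $1^n$. The coefficient of $x^d$ in $p_{i,j}(x)$ is then
equal to the number of shuffles $\pi \in S_n$ with $\pi(1) = i$, $\pi(n) = j$ and with $d$
descents. The number of shuffles $\pi \in S_n$ with $d$ descents is defined as the Eulerian
number $\langle{n \atop d}\rangle$ [9]. Given a permutation of $\{1, 2, \ldots, n-1\}$ with $d$ or $d - 1$ descents,
the number $n$ can be inserted in $d + 1$ or $n - d$ places, respectively, to obtain a
permutation of $\{1, 2, \ldots, n\}$ with $d$ descents. Thus, as shown in [9], consideration
of the insertion of the number $n$ into a permutation of the numbers $1, 2, \ldots, n - 1$
gives the recurrence

$$\Big\langle{n \atop d}\Big\rangle = (d+1) \Big\langle{n-1 \atop d}\Big\rangle + (n-d) \Big\langle{n-1 \atop d-1}\Big\rangle \ \text{ if } n > 0, \qquad \Big\langle{0 \atop d}\Big\rangle = \begin{cases} 1 & \text{if } d = 0, \\ 0 & \text{if } d \ne 0. \end{cases} \tag{2.2}$$

The modified Eulerian number $\langle{n \atop d}\rangle_i$ is defined as the number of $\pi \in S_n$ with
$\pi(1) = i$ and $d$ descents. If $d = 0$, $\langle{n \atop d}\rangle_i$ is 0 if $i > 1$ and 1 if $i = 1$. Consideration
of the insertion of $n$ into a permutation of the numbers $1, 2, \ldots, n - 1$ that
begins with $i$ gives the recurrence

$$\Big\langle{n \atop d}\Big\rangle_i = \begin{cases} (d+1) \Big\langle{n-1 \atop d}\Big\rangle_i + (n-d-1) \Big\langle{n-1 \atop d-1}\Big\rangle_i & \text{if } n > i,\ d > 0, \\[4pt] \Big\langle{n-1 \atop d-1}\Big\rangle & \text{if } n = i,\ d > 0. \end{cases} \tag{2.3}$$

If $n = i = 1$, $\langle{n \atop d}\rangle_i$ is equal to 1 if $d = 0$ and equal to 0 if $d > 0$. The modified
Eulerian number $\langle{n \atop d}\rangle_{i,j}$ is defined as the number of $\pi \in S_n$ with $\pi(1) = i$, $\pi(n) = j$
and $d$ descents. If $d = 0$, $\langle{n \atop d}\rangle_{i,j}$ is 1 if $i = 1$ and $j = n$ but 0 otherwise. For $d > 0$,
the following recurrence can be derived:

$$\Big\langle{n \atop d}\Big\rangle_{i,j} = \begin{cases} d \Big\langle{n-1 \atop d}\Big\rangle_{i,j} + (n-d-1) \Big\langle{n-1 \atop d-1}\Big\rangle_{i,j} & \text{if } n > i,\ n > j, \\[4pt] \Big\langle{n-1 \atop d-1}\Big\rangle_{n-j} & \text{if } n = i,\ n > j, \\[4pt] \Big\langle{n-1 \atop d}\Big\rangle_{i} & \text{if } n > i,\ n = j. \end{cases} \tag{2.4}$$

If $n = i = j = 1$ and $d > 0$, $\langle{n \atop d}\rangle_{i,j} = 0$. Using (2.2), (2.3) and (2.4), the
polynomials $p_{i,j}(x)$ can be formed in the base case.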
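
The three recurrences can be sketched with memoized functions. This is an illustrative implementation only. One step is an assumption made here rather than something stated in the text: the count of permutations of $\{1, \ldots, n-1\}$ that end in $j$ is computed as the count of those that begin with $n - j$, which follows from the descent-preserving symmetry $\sigma(k) \mapsto n - \sigma(n - k)$.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def eulerian(n, d):
    """Number of permutations in S_n with d descents, as in (2.2)."""
    if d < 0:
        return 0
    if n == 0:
        return 1 if d == 0 else 0
    return (d + 1) * eulerian(n - 1, d) + (n - d) * eulerian(n - 1, d - 1)

@lru_cache(maxsize=None)
def eulerian_first(n, d, i):
    """Number of pi in S_n with pi(1) = i and d descents, as in (2.3)."""
    if d < 0 or not 1 <= i <= n:
        return 0
    if n == 1:
        return 1 if d == 0 else 0
    if d == 0:
        return 1 if i == 1 else 0
    if i == n:  # n comes first: delete it and lose one descent
        return eulerian(n - 1, d - 1)
    return ((d + 1) * eulerian_first(n - 1, d, i)
            + (n - d - 1) * eulerian_first(n - 1, d - 1, i))

@lru_cache(maxsize=None)
def eulerian_first_last(n, d, i, j):
    """Number of pi in S_n with pi(1) = i, pi(n) = j and d descents, as in (2.4)."""
    if d < 0 or not (1 <= i <= n and 1 <= j <= n):
        return 0
    if n == 1:
        return 1 if d == 0 else 0
    if i == j:
        return 0
    if d == 0:
        return 1 if (i == 1 and j == n) else 0
    if i == n:  # assumption: "ends with j" counted via "begins with n - j"
        return eulerian_first(n - 1, d - 1, n - j)
    if j == n:  # n comes last: delete it, descents unchanged
        return eulerian_first(n - 1, d, i)
    return (d * eulerian_first_last(n - 1, d, i, j)
            + (n - d - 1) * eulerian_first_last(n - 1, d - 1, i, j))

if __name__ == "__main__":
    print([eulerian(4, d) for d in range(4)])  # [1, 11, 11, 1]
```

Summing `eulerian_first_last(n, d, i, j)` over all `i` and `j` gives `eulerian(n, d)` back, which is a convenient consistency check.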

The descent polynomial of shuffles from $1^{n_1}, 2^{n_2}, \ldots, h^{n_h}$ to $D$ is obtained as
the sum of the polynomials $p_{i,j}(x)$ over $1 \le i \le n_1$ and $1 \le j \le n_h$.

We now turn to the descent polynomial of shuffles $\pi$ from $D$ to the sorted deck
$1^{n_1}, 2^{n_2}, \ldots, h^{n_h}$. We first consider the occurrence of descents between $\pi(k)$ and
$\pi(k + 1)$ when the positions $k$ and $k + 1$ are occupied in $D$ by cards with different
labels. There will be a descent if and only if the label of the card at $k$ is greater
than the label of the card at $k + 1$. Thus, the number of descents of this type is the
same for every shuffle from $D$ to the sorted deck and is equal to the number of
places where a card with a greater label immediately precedes a card with a lesser
label in the deck $D$. This quantity, which may be denoted by $\mathrm{des}(D)$, is called the
number of descents in $D$ and is extensively studied in [10] and [12].

We next consider descents between $\pi(k)$ and $\pi(k + 1)$ only if both positions $k$
and $k + 1$ are occupied in $D$ by cards with the label $c$. The cards at $k$ and $k + 1$
both have the label $c$ if and only if $k = D(i, c)$ and $k + 1 = D(i + 1, c)$ for some
integer $i$, $1 \le i < n_c$. To facilitate the counting of this type of descent, denote
the generating polynomial $\sum_d \langle{n \atop d}\rangle x^d$ of the Eulerian numbers by $\eta_n(x)$ [10].
If we pay attention only to cards with label $c$ in the deck $D$, it will look like
$* c * ccc * * * cc * * c$ with blocks of $c$'s separated by cards with different labels.
Assume that the lengths of these blocks are given by $m_1, m_2, \ldots, m_\gamma$, with $\gamma$ being
the number of blocks. Then $n_c = m_1 + m_2 + \cdots + m_\gamma$. Let

$$j_c = \sum_{j=1}^{c} n_j$$

and let $i_c = j_c - n_c + 1$. If $\pi$ is a shuffle from $D$ to the sorted deck, then $i_c \le
\pi(D(i, c)) \le j_c$ must hold for $1 \le i \le n_c$. The $n_c$ integers in $[i_c, j_c]$ can be divided
into sets of $m_1, m_2, \ldots, m_\gamma$ in $n_c!/(m_1!\, m_2! \cdots m_\gamma!)$ ways. For each such division
of these $n_c$ integers into sets, there are $m_1!\, m_2! \cdots m_\gamma!$ ways of assigning values to
$\pi(D(i, c))$, $1 \le i \le n_c$, such that a number assigned to a position within the first
block of $c$'s is in the first set and so on. The coefficient of $x^d$ of the polynomial
$\eta_{m_1}(x)\, \eta_{m_2}(x) \cdots \eta_{m_\gamma}(x)$ is equal to the number of these assignments which have
$d$ descents. Therefore, the coefficient of $x^d$ of the polynomial

$$p_c(x) = \frac{n_c!}{m_1!\, m_2! \cdots m_\gamma!}\, \eta_{m_1}(x)\, \eta_{m_2}(x) \cdots \eta_{m_\gamma}(x) \tag{2.5}$$

is equal to the number of assignments with $d$ descents out of the $n_c!$ assignments of
integers in $[i_c, j_c]$ to $\pi(D(i, c))$, $1 \le i \le n_c$. As intended, (2.5) counts the descent
between $\pi(k)$ and $\pi(k + 1)$ if and only if cards at positions $k$ and $k + 1$ in $D$ both
have the label $c$.

To find the descent polynomial of shuffles from $D$ to the sorted deck, note that
the occurrence of a descent between $\pi(k)$ and $\pi(k + 1)$, with cards labeled $c$ at
positions $k$ and $k + 1$ in $D$, is completely independent of the occurrence of a
descent between $\pi(l)$ and $\pi(l + 1)$, with cards labeled $d$ at positions $l$ and $l + 1$,
if $c \ne d$. Moreover, there are always $\mathrm{des}(D)$ descents in a shuffle $\pi$ from $D$ to the
sorted deck that correspond to positions $k$ and $k + 1$ occupied in $D$ by cards with
different labels. Therefore, the descent polynomial is given by

$$x^{\mathrm{des}(D)}\, p_1(x)\, p_2(x) \cdots p_h(x), \tag{2.6}$$

where the $p_c(x)$ are defined by (2.5).
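
The product formula can be compared with direct enumeration on small decks. In the sketch below, polynomials are lists of coefficients indexed by the exponent. The Eulerian polynomial of order $m$ is built from the standard recurrence, and all function names are hypothetical.

```python
from itertools import permutations
from math import factorial

def eulerian_row(m):
    """Coefficients of the Eulerian polynomial of order m (index = number of descents)."""
    row = [1]  # order 0
    for n in range(1, m + 1):
        prev = row
        get = lambda d: prev[d] if 0 <= d < len(prev) else 0
        row = [(d + 1) * get(d) + (n - d) * get(d - 1) for d in range(n)]
    return row

def poly_mul(p, q):
    out = [0] * (len(p) + len(q) - 1)
    for i, x in enumerate(p):
        for j, y in enumerate(q):
            out[i + j] += x * y
    return out

def run_lengths(deck, label):
    """Lengths of the maximal blocks of consecutive cards carrying `label`."""
    runs, current = [], 0
    for card in deck:
        if card == label:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs

def to_sorted_polynomial(deck):
    """Descent polynomial of shuffles from `deck` to its sorted order, via (2.5) and (2.6)."""
    des = sum(1 for x, y in zip(deck, deck[1:]) if x > y)
    poly = [0] * des + [1]  # the factor x^des(D)
    for label in sorted(set(deck)):
        runs = run_lengths(deck, label)
        coeff = factorial(sum(runs))
        for m in runs:
            coeff //= factorial(m)
        block = [coeff]
        for m in runs:
            block = poly_mul(block, eulerian_row(m))
        poly = poly_mul(poly, block)
    return poly

def brute_force_polynomial(deck):
    """Same polynomial by enumerating every permutation (small decks only)."""
    n, target = len(deck), sorted(deck)
    counts = [0] * n
    for pi in permutations(range(n)):  # the card at position i moves to position pi[i]
        out = [None] * n
        for i, card in enumerate(deck):
            out[pi[i]] = card
        if out == target:
            counts[sum(1 for x, y in zip(pi, pi[1:]) if x > y)] += 1
    return counts

if __name__ == "__main__":
    print(to_sorted_polynomial((1, 2, 2, 1)))  # [0, 2, 2], that is 2x + 2x^2
```

For the deck $1, 2, 2, 1$ there is one descent in the deck, the two isolated 1's contribute the factor 2, and the block of two 2's contributes $1 + x$, giving $2x + 2x^2$.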

If the deck $D$ is any permutation of the multiset $\{1^{n_1}, 2^{n_2}, \ldots, h^{n_h}\}$,
(2.1) and (2.6) make it possible to find the descent polynomials of shuffles from
the sorted deck to $D$ and of shuffles from $D$ to the sorted deck in polynomial time.
The descent polynomial of shuffles between decks neither of which is sorted will
be considered in later work.

In the rest of this section, we turn to theorems about transition probabilities
between decks under an $a$-shuffle which do not use the descent polynomial. Let
$a_1, a_2, \ldots, a_n$ be one of the $a^n$ integer sequences with $0 \le a_i < a$ for $1 \le i \le n$.
This sequence can be sorted to $a_{i_1} \le a_{i_2} \le \cdots \le a_{i_n}$ in a stable manner and
the permutation $i_1, i_2, \ldots, i_n$ of $\{1, 2, \ldots, n\}$ is uniquely determined since we
require $i_j < i_{j+1}$ if $a_{i_j} = a_{i_{j+1}}$. Associate the shuffle $\pi \in S_n$ with $\pi(k) = i_k$ for
$1 \le k \le n$ with the sequence $a_1, a_2, \ldots, a_n$. Then the uniform distribution on the
$a^n$ sequences induces the $a$-shuffle distribution on $S_n$ [2]. This description of the
$a$-shuffle is used in Theorems 2.1 and 2.2.
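
The sequence description translates directly into code, because a stable sort of the positions by their digits produces $i_1, \ldots, i_n$. The sketch below enumerates all $a^n$ sequences for small $n$ and $a$ so that the induced distribution can be compared with the count $\binom{a+n-d-1}{n}$; the function names are illustrative.

```python
from collections import Counter
from itertools import product
from math import comb

def shuffle_from_sequence(seq):
    """Return pi with pi(k) = i_k, where i_1, ..., i_n stably sorts the sequence."""
    order = sorted(range(len(seq)), key=lambda j: seq[j])  # Python's sort is stable
    return tuple(j + 1 for j in order)

def descents(pi):
    return sum(1 for x, y in zip(pi, pi[1:]) if x > y)

def a_shuffle_counts(n, a):
    """Number of sequences in {0, ..., a-1}^n that induce each shuffle."""
    return Counter(shuffle_from_sequence(s) for s in product(range(a), repeat=n))

if __name__ == "__main__":
    # Drop pattern of the pictured 3-shuffle of 6 cards.
    print(shuffle_from_sequence((2, 0, 1, 1, 0, 0)))  # (2, 5, 6, 3, 4, 1)
    counts = a_shuffle_counts(4, 3)
    print(all(c == comb(3 + 4 - descents(pi) - 1, 4) for pi, c in counts.items()))
```

Dividing each count by $a^n$ gives the probability of the corresponding shuffle.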


THEOREM 2.1. Among all decks $D$ that are permutations of the multiset
$\{1^{n_1}, 2^{n_2}, \ldots, h^{n_h}\}$, the transition probability under an $a$-shuffle from the sorted
deck $1^{n_1}, 2^{n_2}, \ldots, h^{n_h}$ to $D$ is greatest for $D = 1^{n_1}, 2^{n_2}, \ldots, h^{n_h}$ and least for
$D = h^{n_h}, \ldots, 2^{n_2}, 1^{n_1}$.


PROOF. Assume the sorted deck to be $1^n, 2^n$. The proof for more general
sorted decks is similar.

Let $a_1, a_2, \ldots, a_{2n}$ be a sequence with $0 \le a_i < a$ for $1 \le i \le 2n$. If $D(i, 1) = j$,
define $\alpha_i = a_j$, and if $D(i, 2) = j$, define $\beta_i = a_j$. For example, if $D =
1, 2, 1, 2, 1, 2$, then

$$a_1, a_2, a_3, a_4, a_5, a_6 = \alpha_1, \beta_1, \alpha_2, \beta_2, \alpha_3, \beta_3.$$

For the sequence to induce a shuffle from $1^n, 2^n$ to $D$, each $\alpha_i$ must be less than
or equal to each $\beta_j$. In addition, each $\alpha_i$ must be strictly less than all the $\beta$'s that
precede it in the sequence. For example, if $D = 1, 2, 1, 2, 1, 2$, the inequalities are

$$\max(\alpha_1, \alpha_2, \alpha_3) \le \min(\beta_1, \beta_2, \beta_3), \quad \alpha_2 < \beta_1, \quad \alpha_3 < \beta_1, \quad \alpha_3 < \beta_2.$$

If $D = 1^n, 2^n$, it is enough if each $\alpha$ is less than or equal to each $\beta$. If $D = 2^n, 1^n$,
each $\alpha$ must be strictly less than each $\beta$. Therefore, the number of sequences that
induce a shuffle from $1^n, 2^n$ to $D$ is greatest for $D = 1^n, 2^n$ and least for $D =
2^n, 1^n$. The statement about transition probabilities follows. □
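
The counting argument can be illustrated by tallying, for every sequence, the deck that it produces from a sorted starting deck. This is a brute-force sketch for small decks only; the helper names and the choice of example deck are assumptions made for illustration.

```python
from collections import Counter
from itertools import permutations, product

def deal(start, seq):
    """Deck produced from `start` by the a-shuffle encoded by the sequence `seq`."""
    order = sorted(range(len(seq)), key=lambda j: seq[j])  # stable sort gives i_1, ..., i_n
    out = [None] * len(start)
    for k, pos in enumerate(order):
        out[pos] = start[k]  # the k-th card of the starting deck lands at position i_k
    return tuple(out)

def sequence_counts(start, a):
    """Number of sequences in {0, ..., a-1}^n leading from `start` to each deck."""
    return Counter(deal(start, s) for s in product(range(a), repeat=len(start)))

if __name__ == "__main__":
    start = (1, 1, 2, 2, 3)
    counts = sequence_counts(start, 4)
    decks = set(permutations(start))
    best = max(decks, key=lambda d: counts.get(d, 0))
    worst = min(decks, key=lambda d: counts.get(d, 0))
    print(counts[best] == counts[start], counts.get(worst, 0) == counts.get(start[::-1], 0))
```

Ties for the extreme counts are possible, so the check compares counts rather than decks.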


Theorem 2.2 below generalizes Theorem 3 of [2] and their proofs use similar 
arguments. Similar arguments can also be found in [6] and [10]. 


THEOREM 2.2. Let the deck $D$ be a permutation of the multiset $\{1, 2, \ldots,
h, x^n\}$. Let the number of cards labeled $c$, $1 \le c \le h$, that are not preceded by
a card labeled $c - 1$ in $D$ be equal to $r$. Let the number of cards labeled $x$ that
precede the card labeled $h$ in $D$ be equal to $l$. Then the probability that an $a$-shuffle
applied to the sorted deck $1, 2, \ldots, h, x^n$ results in $D$ is

$$\frac{1}{a^{n+h}} \sum_{m=r-1}^{a-1} \binom{m-r+h}{h-1} (a-m-1)^l (a-m)^{n-l},$$

where if $l = 0$ and $m = a - 1$, $(a - m - 1)^l$ must be taken to be 1.


PROOF. Let $a_1, a_2, \ldots, a_{h+n}$ be an integer sequence with $0 \le a_i < a$ for $1 \le
i \le h + n$. If $D(1, c) = i$, $1 \le c \le h$, define $\alpha_c = a_i$. If $D(j, x) = k$, $1 \le j \le n$,
define $\beta_j = a_k$. For the sequence $a_1, a_2, \ldots, a_{h+n}$ to induce an $a$-shuffle from the
sorted deck to $D$, we require

$$\alpha_1 \le \alpha_2 \le \cdots \le \alpha_h \le \min(\beta_1, \ldots, \beta_n). \tag{2.7}$$

In addition, the inequality $\alpha_{c-1} \le \alpha_c$, $2 \le c \le h$, must be strict if the card labeled
$c$ in $D$ is not preceded by the card labeled $c - 1$. Therefore, exactly $r - 1$
inequalities between the $\alpha$'s in (2.7) are strict. Further, at least $l$ of the $\beta_i$, the ones with
$1 \le i \le l$, must be strictly greater than $\alpha_h$.

The number of solutions to (2.7), with the additional conditions described below
it, can be counted by allowing $\alpha_h = m$ to vary from $r - 1$ to $a - 1$. Given $m$, the
number of ways to pick the $\alpha_c$, $1 \le c < h$, can be counted as follows. Start with
$m$ "jumps." Allocate $r - 1$ of these jumps to the inequalities in $\alpha_1 \le \alpha_2 \le \cdots \le
\alpha_{h-1} \le m$ that must be strict. The remaining $m - r + 1$ jumps can be assigned to $h$
positions, namely, the position before $\alpha_1$ and the $h - 1$ inequalities, in $\binom{m-r+h}{h-1}$
ways. The value of $\alpha_i$, for $1 \le i \le h - 1$, is equal to the number of jumps preceding
it. The number of ways to pick the $\beta$'s is $(a - m - 1)^l (a - m)^{n-l}$. The formula for
the transition probability from the sorted deck to $D$ follows. □
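
A small check of the counting formula against enumeration of all sequences, offered as a sketch under stated assumptions. The repeated card is represented by the string `'x'`, the function names are hypothetical, and Python's convention `0 ** 0 == 1` supplies the special case $l = 0$, $m = a - 1$.

```python
from itertools import product
from math import comb

def theorem_count(deck, h, a):
    """a^(n+h) times the probability in Theorem 2.2, for a deck of 1..h and 'x' cards."""
    n = deck.count('x')
    pos = {c: deck.index(c) for c in range(1, h + 1)}
    # r counts card 1 plus every card c whose predecessor c - 1 lies after it in the deck.
    r = 1 + sum(1 for c in range(2, h + 1) if pos[c - 1] > pos[c])
    # l counts the x cards that lie before the card labeled h.
    l = sum(1 for k in range(pos[h]) if deck[k] == 'x')
    return sum(comb(m - r + h, h - 1) * (a - m - 1) ** l * (a - m) ** (n - l)
               for m in range(r - 1, a))

def brute_count(deck, h, a):
    """Number of sequences that shuffle the sorted deck 1, ..., h, x^n into `deck`."""
    n = deck.count('x')
    start = tuple(range(1, h + 1)) + ('x',) * n
    total = 0
    for seq in product(range(a), repeat=n + h):
        order = sorted(range(n + h), key=lambda j: seq[j])  # stable sort
        out = [None] * (n + h)
        for k, p in enumerate(order):
            out[p] = start[k]
        if tuple(out) == tuple(deck):
            total += 1
    return total

if __name__ == "__main__":
    deck = (2, 'x', 1, 'x', 3, 'x')
    print(theorem_count(deck, 3, 4), brute_count(deck, 3, 4))  # both 18
```

For this deck $r = 2$ and $l = 2$, and the sum over $m = 1, 2, 3$ gives $12 + 6 + 0 = 18$ sequences out of $4^6$.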


3. Starting deck $1^n, 2^n$. The probability distribution over decks that are
permutations of the same multiset of cards under an $a$-shuffle can be obtained from
(2.1) or (2.6) if either the starting deck or the ending deck is in sorted order. The
total variation distance from the uniform distribution is a sum over all possible decks
and its calculation can therefore involve a very large number of terms. However,
the calculation becomes simpler if it is recognized that the transition probabilities
are the same for whole classes of decks. In the case where all $n$ cards have
different labels, the transition probabilities depend only upon the number of descents
in the shuffle and, hence, the $n!$ decks fall into only $n$ equivalence classes. In this
section we investigate this type of equivalence relationship when the starting deck
is $1^n, 2^n$.

In this section and the next, we use $\alpha$, $\beta$ and $\gamma$ to denote sequences of 1's and 2's
that stand for segments of a deck of cards. The number of entries in the
sequence $\alpha$ is denoted by $|\alpha|$. The sequence obtained by reversing the order of $\alpha$
and then replacing each 1 by 2 and each 2 by 1 is denoted $\alpha^*$. For example, if
$\alpha = 1, 2, 2, 2, 1, 1$, then $\alpha^* = 2, 2, 1, 1, 1, 2$. A total of $\binom{2n}{n}$ decks can be obtained
by rearranging the cards of $1^n, 2^n$. The equivalence relation $R$ on that set of decks
is defined as follows. The deck $D_1 = \alpha\beta\gamma$ is $R$-related to $D_2 = \alpha\beta^*\gamma$ if $|\alpha| = |\gamma|$,
and the number of 1's and the number of 2's in $\beta$ are equal. For example, $1, 2, 2, 1$
is $R$-related to $2, 1, 1, 2$ and $1, 1, 2, 2, 1, 2$ is $R$-related to $1, 2, 1, 1, 2, 2$. The
equivalence relation is obtained by taking the transitive, reflexive closure. For example,
the decks $1, 2, 1, 2, 2, 1, 2, 1$ and $1, 2, 2, 1, 1, 2, 2, 1$ and $2, 1, 1, 2, 2, 1, 1, 2$ and
$2, 1, 2, 1, 1, 2, 1, 2$ are all in the same equivalence class.


THEOREM 3.1. If $D_1$ is $R$-related to $D_2$, the transition probability from $1^n, 2^n$
to $D_1$ is equal to the transition probability from $1^n, 2^n$ to $D_2$ under an $a$-shuffle
for any $a$.
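
The relation $R$ and the statement of the theorem can be explored numerically for small $n$. The sketch below is illustrative only: it builds the equivalence classes by a graph search over $R$-neighbors and tabulates descent polynomials from $1^n, 2^n$ by enumerating all of $S_{2n}$, which is practical only for $n \le 3$ or so. All names are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations, permutations

def star(segment):
    """Reverse the segment and exchange the labels 1 and 2."""
    return tuple(3 - card for card in reversed(segment))

def r_neighbors(deck):
    """Decks alpha beta* gamma with |alpha| = |gamma| and beta balanced."""
    size = len(deck)
    for k in range(size // 2 + 1):
        beta = deck[k:size - k]
        if beta.count(1) == beta.count(2):
            yield deck[:k] + star(beta) + deck[size - k:]

def all_decks(n):
    for ones in combinations(range(2 * n), n):
        yield tuple(1 if i in ones else 2 for i in range(2 * n))

def equivalence_classes(n):
    seen, classes = set(), []
    for deck in all_decks(n):
        if deck in seen:
            continue
        cls, stack = {deck}, [deck]
        while stack:
            for nb in r_neighbors(stack.pop()):
                if nb not in cls:
                    cls.add(nb)
                    stack.append(nb)
        seen |= cls
        classes.append(cls)
    return classes

def descent_polynomials(n):
    """For each deck D, a Counter d -> number of shuffles from 1^n, 2^n to D with d descents."""
    start = (1,) * n + (2,) * n
    polys = defaultdict(Counter)
    for pi in permutations(range(2 * n)):  # the card at position i moves to position pi[i]
        out = [0] * (2 * n)
        for i, p in enumerate(pi):
            out[p] = start[i]
        polys[tuple(out)][sum(1 for x, y in zip(pi, pi[1:]) if x > y)] += 1
    return polys

if __name__ == "__main__":
    polys = descent_polynomials(3)
    for cls in equivalence_classes(3):
        print(sorted(cls), dict(polys[next(iter(cls))]))
```

Equal descent polynomials imply equal transition probabilities for every $a$ through (1.1), so comparing polynomials within a class is enough.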


PROOF. It is sufficient to consider $D_1 = \alpha\beta\gamma$ and $D_2 = \alpha\beta^*\gamma$ with $|\alpha| = |\gamma|$
and with equal number of 1's and 2's in $\beta$. It is enough to show that the descent
polynomial of shuffles from $1^n, 2^n$ to $D_1$ is equal to the descent polynomial of
shuffles from $1^n, 2^n$ to $D_2$. We will construct a bijective map from $\pi(1^n, 2^n; D_1)$
to $\pi(1^n, 2^n; D_2)$ such that a shuffle maps to another shuffle with exactly the same
number of descents.

The number of 1's in $\alpha$ is equal to the number of 2's in $\gamma$ since the number of
1's and 2's are equal in $\alpha\beta\gamma$ and in $\beta$. Similarly, the number of 2's in $\alpha$ is equal
to the number of 1's in $\gamma$. Let $p_1 < \cdots < p_a$ and $q_1 < \cdots < q_b$ and $r_1 < \cdots < r_c$
be the positions of $D_1$ that correspond to $\alpha$ and $\beta$ and $\gamma$, respectively, that are
occupied by 1's. Similarly, let $p'_1 < \cdots < p'_c$ and $q'_1 < \cdots < q'_b$ and $r'_1 < \cdots < r'_a$
be the positions of $D_1$ that correspond to $\alpha$ and $\beta$ and $\gamma$, respectively, that are
occupied by 2's.

Define $f(x) = 2n - x + 1$. The map $f$ reflects a position in a deck of size $2n$
about its center; for example, the first position is reflected to the last position. The
positions occupied by 1's in $D_2 = \alpha\beta^*\gamma$ are

$$p_1 < \cdots < p_a < f(q'_b) < \cdots < f(q'_1) < r_1 < \cdots < r_c,$$

where the $p$'s and $r$'s correspond to $\alpha$ and $\gamma$, and the $f(q')$'s correspond to $\beta^*$. In $D_1$,
the position $q'_i$ is occupied by a 2. When $\beta$ is reversed that 2 is moved to the
position $f(q'_i)$ and then it is replaced by 1 to form $\beta^*$. This explains the central
block of $f(q'_i)$'s above. Similarly, the positions occupied by 2's in $D_2$ are

$$p'_1 < \cdots < p'_c < f(q_b) < \cdots < f(q_1) < r'_1 < \cdots < r'_a,$$

where the $p'$'s correspond to positions in $\alpha$, $f(q)$'s to positions in $\beta^*$ and $r'$'s to
positions in $\gamma$. Note that each $p$ or $p'$ is less than each $q$ or $q'$, which is less than
each $r$ or $r'$.

Let $\pi \in S_{2n}$ be a shuffle from $1^n, 2^n$ to $D_1$. Then the numbers

$$\pi(1), \pi(2), \ldots, \pi(n)$$

must be an arrangement of the positions occupied by 1's in $D_1$. Similarly, the
numbers

$$\pi(n+1), \pi(n+2), \ldots, \pi(2n)$$

must be an arrangement of the positions occupied by 2's in $D_1$. The map to a
shuffle from $1^n, 2^n$ to $D_2$ is based on two cases. In the first case, we assume that
not both $\pi(n)$ and $\pi(n+1)$ correspond to positions in $\beta$. The shuffle $\pi^*$ from
$1^n, 2^n$ to $D_2$ that $\pi$ maps to is defined as

$$\pi^*(i) = \phi(\pi(i)),$$

for $1 \le i \le 2n$, where $\phi(\cdot)$ will now be defined. First, we define $\phi(p_i) = p_i$,
$\phi(r_i) = r_i$, $\phi(p'_i) = p'_i$ and $\phi(r'_i) = r'_i$. In addition, we define $\phi$ as

$$q_1 \to f(q'_b), \quad q_2 \to f(q'_{b-1}), \quad \ldots, \quad q_b \to f(q'_1),$$
$$q'_1 \to f(q_b), \quad q'_2 \to f(q_{b-1}), \quad \ldots, \quad q'_b \to f(q_1).$$


This definition maps the $q$'s to the $f(q')$'s and the $q'$'s to the $f(q)$'s and, therefore,
$\pi^*$ is a shuffle from $1^n, 2^n$ to $D_2$. Further, $x < y$ if and only if $\phi(x) < \phi(y)$,
except when $x$ is a $q$ and $y$ is a $q'$ or when $x$ is a $q'$ and $y$ is a $q$. However, in the
arrangement

$$\pi(1), \ldots, \pi(n), \pi(n+1), \ldots, \pi(2n),$$

a $q$ and $q'$ can occur in consecutive positions as $\pi(n)$ and $\pi(n+1)$, and in no
other way, and we have assumed that not both of those positions correspond to $\beta$.
Therefore, the above arrangement has the same number of descents as

$$\phi(\pi(1)), \ldots, \phi(\pi(n)), \phi(\pi(n+1)), \ldots, \phi(\pi(2n))$$

and $\pi^*$ has the same number of descents as $\pi$.
The other case is when $\pi(n)$ is a $q$ and $\pi(n+1)$ is a $q'$. Then we define
$\phi(q_i) = f(q_i)$, $\phi(q'_i) = f(q'_i)$, $p_i \leftrightarrow r'_{a-i+1}$ and $r_i \leftrightarrow p'_{c-i+1}$. We map $\pi$ to $\pi^*$, where
$\pi^*$ is defined as

$$\pi^*(i) = \phi(\pi(2n - i + 1)),$$

for $1 \le i \le 2n$. It can be verified that $\pi^*$ is a shuffle from $1^n, 2^n$ to $D_2$. Further,
$x < y$ if and only if $\phi(x) > \phi(y)$, except when $x$ is a $p$ and $y$ is a $p'$, or $x$ is a
$p'$ and $y$ is a $p$, or $x$ is an $r$ and $y$ is an $r'$, or $x$ is an $r'$ and $y$ is an $r$. In the
arrangement

$$\pi(1), \ldots, \pi(n), \pi(n+1), \ldots, \pi(2n),$$

a $p$ and $p'$ or an $r$ and $r'$ can occur in consecutive positions only at $\pi(n)$ and
$\pi(n+1)$. However, we have assumed that $\pi(n)$ is a $q$ and that $\pi(n+1)$ is a $q'$.
Therefore, every descent in the above arrangement becomes an ascent in

$$\phi(\pi(1)), \ldots, \phi(\pi(n)), \phi(\pi(n+1)), \ldots, \phi(\pi(2n))$$

and every ascent becomes a descent. This arrangement is reversed to define
$\pi^*(1), \ldots, \pi^*(2n)$, which changes the ascents back into descents and, therefore,
the number of descents in $\pi^*$ is equal to the number of descents in $\pi$.

Finally, we need to show that the map defined above is a bijection. A shuffle
from $1^n, 2^n$ to $D_2$ can be mapped to a shuffle from $1^n, 2^n$ to $D_1$ using the same
procedure as above. The resulting map is the inverse of the above map because
$\phi \circ \phi$ is the identity in both cases above. □
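As an illustration only, the equality of descent polynomials asserted in Theorem 3.1 can be checked by brute force for small $n$. The sketch below assumes, following the proof, that a permutation $\pi$ is a shuffle from a source deck to a target deck when the card at position $\pi(i)$ of the target equals the card at position $i$ of the source, and that descents are counted in $\pi(1), \ldots, \pi(2n)$. The names `star`, `apply_rule` and `descent_counts` are hypothetical helpers introduced here.

```python
from collections import Counter
from itertools import permutations


def star(beta):
    """Reverse beta and exchange 1's and 2's (the operation written beta*)."""
    return tuple(3 - card for card in reversed(beta))


def apply_rule(deck, i):
    """Replace the central block of 2*i cards (so |alpha| = |gamma|) by its star."""
    n = len(deck) // 2
    beta = deck[n - i:n + i]
    if 2 * beta.count(1) != len(beta):
        raise ValueError("beta must contain as many 1's as 2's")
    return deck[:n - i] + star(beta) + deck[n + i:]


def descent_counts(source, target):
    """Map each descent count d to the number of shuffles from source to
    target having exactly d descents (exhaustive search, small decks only)."""
    size = len(source)
    tally = Counter()
    for pi in permutations(range(size)):          # 0-based positions
        if all(target[pi[i]] == source[i] for i in range(size)):
            tally[sum(pi[i] > pi[i + 1] for i in range(size - 1))] += 1
    return dict(tally)


if __name__ == "__main__":
    source = (1, 1, 1, 2, 2, 2)                   # the deck 1^3, 2^3
    d1 = (1, 1, 2, 2, 1, 2)                       # alpha = 1, beta = 1221, gamma = 2
    d2 = apply_rule(d1, 2)                        # beta replaced by beta*
    print(d2)
    print(descent_counts(source, d1))
    print(descent_counts(source, d2))
```

The search visits all $(2n)!$ permutations, so it is only practical for very small decks; its purpose is to make the definitions concrete rather than to replace the bijective argument.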


It is natural to ask if the equality of the descent polynomials of the shuffles from
$1^n, 2^n$ to $D_1$ and $D_2$ implies that $D_1$ is R-related to $D_2$. We have checked that this
is indeed so for $n = 1, 2, 3, 4, 5, 6, 7$. The theorem below counts the total number
of equivalence classes under the relation R.


THEOREM 3.2. The number of equivalence classes under R is equal to the
Catalan number $\frac{1}{n+2}\binom{2n+2}{n+1}$.


PROOF. We describe a method to find a unique representative for each
equivalence class and then count the number of unique representatives. The function
$f(x) = 2n - x + 1$ reflects positions with respect to the center of the deck as
before. In this proof, we refer to $f(x)$ as the reflection of the position $x$. A position
and its reflection either both lie in $\beta$ or both lie outside it, since $|\alpha| = |\gamma|$. Consider the
positions $x = n+1, n+2, \ldots, 2n$. Assume that $x$ and $f(x)$ both lie inside $\beta$. If
the positions $x, f(x)$ are occupied by 1, 2 in the deck $D$, reversal of $\beta$ changes
it to 2, 1, and the replacements of 1's by 2's and 2's by 1's changes it back to
1, 2. Similarly, application of the basic rule that generates the relation R does not
change $D$ in the positions $x, f(x)$ if those positions are occupied by 2, 1. However,
if they are occupied by 1, 1, that becomes 2, 2 when the rule is applied using
a $\beta$ large enough to include positions $x$ and $f(x)$. Similarly, 2, 2 becomes 1, 1.


With each position $x$, we associate the symbol "+" if positions $x, f(x)$ are
occupied by 1, 2, the symbol "−" if occupied by 2, 1, the symbol 1 if occupied by
1, 1, and the symbol 2 if occupied by 2, 2. The deck as a whole is coded as the
list of symbols associated with positions $n + 1$ through $2n$. For example, the deck
1, 1, 2, 1, 2, 2, 2, 1, 1, 2, 1, 2, 1, 2 is coded as +, +, 2, 1, 2, 1, −.

The +'s and −'s never change when the rule that generates the relation R is
repeatedly applied with possibly many different choices of $\beta$. They are ignored
in much of the rest of this proof. We can find the $\beta$'s which lead to a nontrivial
application of the rule to generate the relation R using the code for the deck $D$
as follows. We traverse the code from left to right, and record the excess of 1's
over 2's. For example, for the code −, −, 2, 1, 2, 1, +, this excess is −1 after the
first 2 is passed, then becomes 0, and then −1, and then 0. The rule for generating
R can be applied whenever this excess becomes 0. If the excess becomes zero after
traversing $i$ symbols in the code, the corresponding $\beta$ in the deck is a segment
of $2i$ cards extending from position $n - i + 1$ to position $n + i$. When the rule
is applied, the 1's become 2's and the 2's become 1's among the first $i$ symbols
of the code. If the excess becomes 0 after $i$ symbols and then again after $i + j$
symbols, the application of the rule with a $\beta$ of length equal to $2i$ followed by
another application using a $\beta$ of length $2(i + j)$ changes the code for the deck only
between the $(i + 1)$st and the $(i + j)$th symbol. Among these symbols, the 1's change
to 2's and the 2's change to 1's. By applying the rule with judicious choices of $\beta$,
it is possible to obtain a single code in which the excess never becomes negative.
For example, the code −, −, 2, 1, 2, 1, + can be converted to −, −, 1, 2, 1, 2, +.
We use such codes as unique representatives of equivalence classes of decks.
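The coding and the choice of representative described above can be sketched in Python as follows. This is an illustrative reading of the procedure, not code from the paper: `deck_code`, `canonical_code` and `count_representatives` are hypothetical names, and the ASCII character "-" stands for the symbol −. Each stretch of the code between consecutive returns of the excess to zero is flipped when it starts with a 2, which is the net effect of applying the rule twice as explained above.

```python
from itertools import combinations


def deck_code(deck):
    """Code of a deck with n 1's and n 2's: one symbol for each x = n+1..2n."""
    n = len(deck) // 2
    symbols = {(1, 2): "+", (2, 1): "-", (1, 1): "1", (2, 2): "2"}
    code = []
    for x in range(n + 1, 2 * n + 1):            # 1-based position x
        fx = 2 * n - x + 1                       # its reflection f(x)
        code.append(symbols[(deck[x - 1], deck[fx - 1])])
    return code


def canonical_code(code):
    """Return the equivalent code whose excess of 1's over 2's is never negative."""
    result = list(code)
    excess = 0
    start = 0                                    # first index of the current stretch
    for i, symbol in enumerate(code):
        if symbol == "1":
            excess += 1
        elif symbol == "2":
            excess -= 1
        else:
            continue                             # "+" and "-" never change
        if excess == 0:
            first = next(c for c in code[start:i + 1] if c in ("1", "2"))
            if first == "2":                     # stretch dips below zero: flip it
                for j in range(start, i + 1):
                    if result[j] == "1":
                        result[j] = "2"
                    elif result[j] == "2":
                        result[j] = "1"
            start = i + 1
    return result


def count_representatives(n):
    """Number of distinct canonical codes over all decks with n 1's and n 2's."""
    seen = set()
    for ones in combinations(range(2 * n), n):
        deck = [2] * (2 * n)
        for position in ones:
            deck[position] = 1
        seen.add(tuple(canonical_code(deck_code(deck))))
    return len(seen)


if __name__ == "__main__":
    print(deck_code([1, 1, 2, 1, 2, 2, 2, 1, 1, 2, 1, 2, 1, 2]))
    print(canonical_code(["-", "-", "2", "1", "2", "1", "+"]))
    print([count_representatives(n) for n in range(1, 5)])
```

Counting canonical codes by enumeration gives an independent check of the count of equivalence classes for small $n$.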

Assume that in such a code, there are $k$ symbols equal to 1 and $k$ symbols equal
to 2. Then there must be $n - 2k$ symbols equal to + or −. If the + and − symbols
are ignored, and each 1 is substituted by a ( and each 2 by a ), we obtain a valid
arrangement of parentheses of length $2k$. The number of valid arrangements of
parentheses of length $2k$ is well known to be the Catalan number $\frac{1}{k+1}\binom{2k}{k}$. For
each assignment of 1's and 2's to the $2k$ positions, the other positions can be filled
with symbols + and − in $2^{n-2k}$ ways. The $2k$ positions that are assigned either the
symbol 1 or the symbol 2 can be chosen in $\binom{n}{2k}$ ways. Therefore, the total number
of equivalence classes is given by $\sum_{k} \frac{1}{k+1}\binom{2k}{k}\binom{n}{2k}2^{n-2k}$. This sum can be simplified:


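$$\sum_{k} \frac{1}{k+1}\binom{2k}{k}\binom{n}{2k}2^{n-2k} = 2^{n}\sum_{k} \frac{1}{k+1}\binom{n/2}{k}\binom{(n-1)/2}{k}$$

$$= \frac{2^{n+1}}{n+2}\sum_{k} \binom{n/2+1}{k+1}\binom{(n-1)/2}{k}.$$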


The first equality above uses (5.35) in [9]. The proof may be completed using the
binomial identity $\sum_{k}\binom{l}{m+k}\binom{s}{p+k} = \binom{l+s}{l-m+p}$ for integers $l$, $m$, $p$ and $l \ge 0$. The
cases with $n$ even and odd have to be considered separately. □
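The closed form can also be checked numerically. The short sketch below evaluates the sum over $k$ of (Catalan number of order $k$) times (ways to choose the $2k$ positions) times (ways to fill the rest with + and −) and compares it with the Catalan number of order $n+1$; the function names are hypothetical and the check is no substitute for the proof.

```python
from math import comb


def catalan(m):
    """Catalan number (1/(m+1)) * binomial(2m, m)."""
    return comb(2 * m, m) // (m + 1)


def class_count_by_sum(n):
    """Sum over k of catalan(k) * binomial(n, 2k) * 2^(n-2k)."""
    return sum(catalan(k) * comb(n, 2 * k) * 2 ** (n - 2 * k)
               for k in range(n // 2 + 1))


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, class_count_by_sum(n), catalan(n + 1))
```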


4. Starting deck $(1, 2)^n$. The equivalence relation R in this section is
different from the one considered in the previous section. In this section $D_1 = \alpha\beta\gamma$
is R-related to $D_2 = \alpha\beta^*\gamma$ if $\beta$ has the same number of 1's as 2's. The
additional condition $|\alpha| = |\gamma|$ is no longer required. The decks $D_i$ are all permutations
of $(1^n, 2^n)$. The equivalence relation is obtained by taking the transitive, reflexive
closure. The equivalence class containing 1, 2, 1, 2, 2, 1, 2, 1 has five other decks
in it.


THEOREM 4.1. If $D_1$ is R-related to $D_2$, the transition probability from
$(1, 2)^n$ to $D_1$ is equal to the transition probability from $(1, 2)^n$ to $D_2$ under an
$a$-shuffle for any $a$.


PROOF. It is sufficient to consider $D_1 = \alpha\beta\gamma$ and $D_2 = \alpha\beta^*\gamma$ with equal
number of 1's and 2's in $\beta$. We will construct a bijective map from $\pi((1, 2)^n; D_1)$
to $\pi((1, 2)^n; D_2)$ such that a shuffle maps to another shuffle with exactly the same
number of descents.

Let $\pi \in \pi((1, 2)^n; D_1)$. Let $\pi(2i - 1) = a_i$, $1 \le i \le n$, and $\pi(2i) = b_i$,
$1 \le i \le n$. The number of descents in $\pi$ is equal to the number of descents in
the arrangement $a_1, b_1, a_2, b_2, \ldots, a_n, b_n$. To facilitate the proof, we depict this
arrangement in the following way:


[Diagram: $a_1, a_2, \ldots, a_n$ on the top line and $b_1, b_2, \ldots, b_n$ on the bottom line, joined by a zigzag path $a_1, b_1, a_2, b_2, \ldots, a_n, b_n$.]


In the deck $D_1$ each position $a_i$ is occupied by a 1 and each position $b_i$ is occupied
by a 2 [because $\pi$ is a shuffle from $(1, 2)^n$ to $D_1$]. We assume that $\beta$ begins at the
$(i + 1)$st position and ends at the $(i + 2j)$th position. In the depiction above, circle
all the $a_i$ and $b_i$ that do not correspond to $\beta$, that is, circle an $a_i$ or a $b_i$ if it is
less than $i + 1$ or greater than $i + 2j$. When $\pi$ is mapped, the circled numbers will
stay fixed. The uncircled numbers form segments that run from a circled number
to another circled number (or they might begin or end at $a_1$ or $b_n$). These segments
of uncircled numbers are of four types according as they either begin or end at the
top line or the bottom line:


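[Figure: the four types of uncircled segments. First type: begins on the top line and ends on the bottom line. Second type: begins on the bottom line and ends on the top line. Third type: begins and ends on the bottom line. Fourth type: begins and ends on the top line.]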


The third type of segment has one more uncircled position in the bottom line,
corresponding to a position occupied by a 2 in $D_1$, than in the top line. The fourth
type of segment has an extra uncircled position in the top line, corresponding to
a position occupied by a 1 in $D_1$. Since the number of 1's in $\beta$ is equal to the
number of 2's, the number of uncircled positions in the top line must be equal
to the number of uncircled positions in the bottom line. Therefore, the number
of uncircled segments of the third type must be equal to the number of uncircled
segments of the fourth type.

To map $\pi$ to a shuffle from $(1, 2)^n$ to $D_2$, we will modify the uncircled segments
and insert them back in between the circled numbers in the original arrangement
of $a_i$ and $b_i$. We define a map $f$ from the uncircled positions, that is, the positions
that correspond to $\beta$, back to the uncircled positions as follows:

$$i + 1 \to i + 2j, \quad i + 2 \to i + 2j - 1, \quad \ldots, \quad i + 2j \to i + 1.$$

If $i + 1 \le x \le i + 2j$ and the position $x$ in $D_1$ is occupied by 1 (or 2), the position
$f(x)$ in $D_2$ will be occupied by 2 (or 1). If $a_p, b_p, a_{p+1}, b_{p+1}, \ldots, b_q$ is an
uncircled segment of the first type, it will be modified to $f(b_q), f(a_q), \ldots, f(a_{p+1}),
f(b_p), f(a_p)$. In the deck $D_2$, each position $f(b_i)$ [or $f(a_i)$], $p \le i \le q$, is
occupied by 1 (or by 2). Therefore, the modified segment is also of the first type.
However, when an uncircled segment of the third (or fourth) type is modified
in this way, it becomes a segment of the fourth (or third) type. The arrangement
$a_1, b_1, a_2, b_2, \ldots, a_n, b_n$ can be converted to another arrangement in the following
steps:


1. Extract the uncircled segments of the first and second type from the arrange- 
ment, modify them as described above, and put the modified segment back in 
the same place. 

2. Number the uncircled segments of the third and fourth type from left to right. 
As explained above, they must be equally numerous. 

3. Replace the ith uncircled segment of the third type by the modification of the 
ith uncircled segment of the fourth type. Similarly, replace the ith uncircled 
segment of the fourth type by the modification of the ith uncircled segment of 
the third type. 





The shuffle $\pi^*$ is defined by setting $\pi^*(i)$ equal to the $i$th number in the
arrangement constructed in this manner. By construction, $\pi^*$ is a shuffle from $(1, 2)^n$
to $D_2$. Further, the number of descents of $\pi^*$ must equal the number of descents of
$\pi$ for the following reason. If there is a descent or an ascent between two circled
positions in the arrangement $a_1, b_1, a_2, b_2, \ldots, a_n, b_n$, it remains unchanged. Also,
the modification of uncircled segments described above preserves the number of
descents, although it changes their locations. Finally, if a circled number is greater
than (or less than) a single uncircled number, it must be greater than (or less than)
all uncircled numbers and, therefore, the number of descents between circled and
uncircled numbers in the arrangement also remains unchanged.

It is possible to map a shuffle from $(1, 2)^n$ to $D_2$ to a shuffle from $(1, 2)^n$ to $D_1$
using the same procedure. That map would be the inverse of the map defined above.
Therefore, the map from shuffles $\pi$ to shuffles $\pi^*$ defined above is a bijection. □
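For small decks the conclusion of Theorem 4.1 can be checked by listing every arrangement $a_1, b_1, a_2, b_2, \ldots, a_n, b_n$ directly. The Python sketch below is an illustration under the reading used in the proof: the $a_i$ run over all orderings of the positions occupied by 1's in the target deck and the $b_i$ over all orderings of the positions occupied by 2's. The function name is hypothetical.

```python
from collections import Counter
from itertools import permutations


def descent_counts_alternating(deck):
    """Map each descent count d to the number of shuffles from (1,2)^n to
    `deck` whose arrangement a1, b1, a2, b2, ..., an, bn has d descents."""
    ones = [x + 1 for x, card in enumerate(deck) if card == 1]   # 1-based
    twos = [x + 1 for x, card in enumerate(deck) if card == 2]
    if len(ones) != len(twos):
        raise ValueError("deck must contain as many 1's as 2's")
    tally = Counter()
    for a in permutations(ones):
        for b in permutations(twos):
            # interleave: a1, b1, a2, b2, ..., an, bn
            arrangement = [value for pair in zip(a, b) for value in pair]
            descents = sum(x > y for x, y in zip(arrangement, arrangement[1:]))
            tally[descents] += 1
    return dict(tally)


if __name__ == "__main__":
    d1 = (1, 2, 1, 2, 2, 1, 2, 1)
    d2 = (1, 2, 2, 1, 1, 2, 2, 1)   # beta = 1221 at positions 3..6 replaced by beta*
    print(descent_counts_alternating(d1))
    print(descent_counts_alternating(d2))
```

With $n = 4$ each deck has $4! \cdot 4! = 576$ shuffles, so exhaustive enumeration is immediate.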


The converse of Theorem 4.1 appears to be true as well. The number
of equivalence classes seems to be given by the simple formula $(n + 3)2^{n-2}$. One
can attempt to prove this by finding unique representatives for equivalence classes
and then counting them as in the proof of Theorem 3.2. We have derived a method
to construct unique representatives for equivalence classes of R. However, we have
not yet devised a method to count the number of unique representatives.
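The equivalence classes of this section can be enumerated by brute force for small $n$, which gives a way to compare the class counts with the conjectured formula. The sketch below is an assumption-laden illustration: `star`, `neighbours` and `equivalence_classes` are hypothetical names, and the classes are found by a plain graph search over single applications of the rule.

```python
from itertools import combinations


def star(beta):
    """Reverse beta and exchange 1's and 2's."""
    return tuple(3 - card for card in reversed(beta))


def neighbours(deck):
    """Decks reachable from `deck` by one application of the rule of Section 4."""
    size = len(deck)
    for start in range(size):
        for end in range(start + 2, size + 1, 2):        # blocks of even length
            beta = deck[start:end]
            if 2 * beta.count(1) == len(beta):            # as many 1's as 2's
                yield deck[:start] + star(beta) + deck[end:]


def equivalence_classes(n):
    """List of equivalence classes (as sets) of decks with n 1's and n 2's."""
    decks = []
    for ones in combinations(range(2 * n), n):
        deck = [2] * (2 * n)
        for position in ones:
            deck[position] = 1
        decks.append(tuple(deck))
    unseen = set(decks)
    classes = []
    for deck in decks:
        if deck not in unseen:
            continue
        unseen.discard(deck)
        stack, members = [deck], set()
        while stack:                                      # depth-first search
            current = stack.pop()
            members.add(current)
            for other in neighbours(current):
                if other in unseen:
                    unseen.discard(other)
                    stack.append(other)
        classes.append(members)
    return classes


if __name__ == "__main__":
    for n in range(1, 6):
        # compare the enumerated count with (n + 3) * 2^(n - 2)
        print(n, len(equivalence_classes(n)), (n + 3) * 2 ** n / 4)
```

The search works because the rule is its own inverse (applying it twice with the same block restores the deck), so reachability already gives the equivalence relation.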


5. Card games. Some inferences about the mixing times for common card
games such as blackjack and bridge can be drawn using results given in the
preceding sections. Let $S$ be a finite set and let $p$ be a probability distribution on $S$.
Then the total variation distance of $p$ from the uniform distribution is given by
$\frac{1}{2}\sum_{s \in S} |p(s) - \frac{1}{|S|}|$. For a deck of 52 distinct cards, the total variation distance
remains close to 1 until the number of riffle shuffles exceeds 4. The total variation
distance falls below 0.5 when the number of shuffles is 7 and this can be taken to
be the mixing time [2]. Another distance defined in [1] is the separation distance.
The separation distance of $p$ from the uniform distribution is $\max_{s \in S}(1 - |S| p(s))$.
Like the total variation distance, the separation distance has a maximum of 1 and
a minimum of 0. However, it leads to a more demanding notion of mixing as the
number of riffle shuffles of a deck of 52 distinct cards needed to make the
separation distance no more than 1/2 is 11. The use of entropy to understand mixing is
discussed in [14]. The validity and limitations of the maximum entropy model of
riffle shuffles are discussed in [3] and [5].
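The figures quoted above for 52 distinct cards can be reproduced with a short computation. The sketch below does not use any formula from this paper; it assumes the Bayer and Diaconis result from [2] that, after an $a$-shuffle of $n$ distinct cards, a permutation with $r$ rising sequences has probability $\binom{a+n-r}{n}/a^n$, together with the standard fact that the number of permutations with $r$ rising sequences is an Eulerian number. The function names are hypothetical.

```python
from fractions import Fraction
from math import comb, factorial


def eulerian_row(n):
    """row[k] = number of permutations of n cards with exactly k descents."""
    row = [1]
    for m in range(2, n + 1):
        row = [(k + 1) * (row[k] if k < m - 1 else 0)
               + (m - k) * (row[k - 1] if k > 0 else 0)
               for k in range(m)]
    return row


def riffle_distances(n, shuffles):
    """Total variation and separation distance from uniform after the given
    number of riffle shuffles (a = 2**shuffles) of n distinct cards."""
    a = 2 ** shuffles
    uniform = Fraction(1, factorial(n))
    counts = eulerian_row(n)
    total_variation = Fraction(0)
    separation = Fraction(0)
    for r in range(1, n + 1):                     # r = number of rising sequences
        p = Fraction(comb(a + n - r, n), a ** n)  # probability of one such permutation
        total_variation += counts[r - 1] * abs(p - uniform)
        separation = max(separation, 1 - p / uniform)
    return float(total_variation / 2), float(separation)


if __name__ == "__main__":
    for m in range(1, 13):
        tv, sep = riffle_distances(52, m)
        print(m, round(tv, 3), round(sep, 3))
```

Exact rational arithmetic is used so that the very small probabilities involved are not lost to rounding; only the final values are converted to floating point.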

In the game of bridge, 52 distinct cards are dealt to four players. To apply the
results proved in Section 2, we need to assume that the first 13 cards are dealt to
one player, the next 13 to another and so on. Let the deck $D$ be a permutation
of the multiset $(1^{13}, 2^{13}, 3^{13}, 4^{13})$. Let $p_D$ be the transition probability from $D$
to $1^{13}, 2^{13}, 3^{13}, 4^{13}$ under an $a$-shuffle. This transition probability can be obtained
using (2.6). The probability that the first player is dealt cards originally in the
positions occupied by cards labeled 1 in $D$, that the second player is dealt cards
originally in the positions occupied by cards labeled 2 in $D$, and so on after an
$a$-shuffle is equal to $p_D$. Therefore, the distance of the probability distribution $p_D$
over decks from the uniform distribution will indicate the closeness of a deal after
an $a$-shuffle to a random deal to four players. If the separation distance is used to
define mixing, an application of (2.6) with $D = (4, 3, 2, 1)^{13}$ shows that the
separation distance is greater than 0.5 after 10 riffle shuffles. Therefore, the mixing
time is still 11 riffle shuffles. The total variation distance involves a sum with a
great number of terms and the results of Sections 3 and 4 indicate that the
recognition of equalities among the transition probabilities $p_D$ is unlikely to make this
sum tractable. However, a Monte Carlo procedure for evaluating this sum, which
will be described elsewhere, implies that the mixing time is 6 riffle shuffles when
the total variation distance is used. If the cards are dealt to the players in cyclic
order, which is the common practice, the mixing times will almost certainly be
lower.

In the game of blackjack, the distinction between the suits is ignored. We
assume the starting deck to be $1^4, 2^4, \ldots, 13^4$. Application of Theorem 2.1 and (2.1)
shows that the separation distance from the uniform distribution over decks
becomes less than 0.5 after 9 riffle shuffles. Again, a Monte Carlo procedure has
to be employed to find the total variation distance. It then follows that the total
variation distance becomes less than 0.5 after only 4 riffle shuffles.


Acknowledgments. The authors thank Professors P. Diaconis, J. Fulman, 
S. Lalley, C. Mulcahy and J. Rauch for helpful discussions. 


REFERENCES 


[1] ALDOUS, D. and DIACONIS, P. (1986). Shuffling cards and stopping times. Amer. Math.
Monthly 93 333-348. MR0841111

[2] BAYER, D. and DIACONIS, P. (1992). Trailing the dovetail shuffle to its lair. Ann. Appl. Probab.
2 294-313. MR1161056

[3] DIACONIS, P. (1988). Group Representations in Probability and Statistics. IMS, Hayward, CA.
MR0964069

[4] DIACONIS, P., MCGRATH, M. and PITMAN, J. (1995). Riffle shuffles, cycles and descents.
Combinatorica 15 11-29. MR1325269

[5] EPSTEIN, E. (1977). The Theory of Gambling and Statistical Logic, rev. ed. Academic Press,
New York.

[6] FOATA, D. and SCHÜTZENBERGER, M.-P. (1970). Théorie Géométrique des Polynômes
Eulériens. Springer, Berlin. MR0272642

[7] FULMAN, J. (1998). The combinatorics of biased riffle shuffles. Combinatorica 18 173-184.
MR1656538

[8] FULMAN, J. (2002). Applications of symmetric functions to cycle and increasing subsequence
structure after shuffles. J. Algebraic Combin. 16 165-194. MR1943587

[9] GRAHAM, R. L., KNUTH, D. E. and PATASHNIK, O. (1994). Concrete Mathematics, 2nd ed.
Addison-Wesley, Reading, MA. MR1397498

[10] KNUTH, D. E. (1998). The Art of Computer Programming 3, 3rd ed. Addison-Wesley,
Reading, MA. MR0378456

[11] LALLEY, S. P. (1996). Cycle structure of riffle shuffles. Ann. Probab. 24 49-73. MR1387626

[12] MACMAHON, P. A. (1915). Combinatory Analysis 1. Cambridge Univ. Press.

[13] REYES, J.-C. U. (2002). Random walks, semidirect products, and card shuffling. Ph.D.
dissertation, Stanford Univ.

[14] TREFETHEN, L. N. and TREFETHEN, L. M. (2000). How many shuffles to randomize a deck
of cards? Proc. Roy. Soc. Lond. Ser. A 456 2561-2568. MR1796496


DEPARTMENT OF MATHEMATICS 

UNIVERSITY OF MICHIGAN 

525 E. UNIVERSITY AVENUE 

ANN ARBOR, MICHIGAN 48109 

USA 

E-MAIL: mconger@umich.edu 
divakar@umich.edu




The Annals of Probability 
Vol. 34 May 2006 No. 3 


Articles 


Shortest spanning trees and a counterexample for random walks in random environments
..... MAURY BRAMSON, OFER ZEITOUNI AND MARTIN P. W. ZERNER
Uniqueness of maximal entropy measure on essential spanning forests ..... SCOTT SHEFFIELD
Ends in free minimal spanning forests ..... ADAM TIMAR
On the transience of processes defined on Galton-Watson trees ..... ANDREA COLLEVECCHIO
Local limit of labeled trees and expected volume growth in a random quadrangulation
..... PHILIPPE CHASSAING AND BERGFINNUR DURHUUS
Subtree prune and regraft: A reversible real tree-valued Markov process
..... STEVEN N. EVANS AND ANITA WINTER
Extremes of the discrete two-dimensional Gaussian free field ..... OLIVIER DAVIAUD
Carne-Varopoulos bounds for centered random walks ..... PIERRE MATHIEU
Weak convergence of positive self-similar Markov processes and overshoots of Lévy processes
..... M. E. CABALLERO AND L. CHAUMONT
On the absolute continuity of Lévy processes with drift ..... IVAN NOURDIN AND THOMAS SIMON
Traces of symmetric Markov processes and their characterizations
..... ZHEN-QING CHEN, MASATOSHI FUKUSHIMA AND JIANGANG YING
Skew convolution semigroups and affine Markov processes ..... D. A. DAWSON AND ZENGHU LI
Concentration inequalities and asymptotic results for ratio type empirical processes
..... EVARIST GINÉ AND VLADIMIR KOLTCHINSKII
Martingale structure of Skorohod integral processes
..... GIOVANNI PECCATI, MICHELE THIEULLEN AND CIPRIAN A. TUDOR




